Bizarrely low flops

Started by TouchuvGrey, January 16, 2021, 10:45:27 AM


TouchuvGrey

Can anyone explain to me what this means and what, if anything,
needs to be done?

Estimated AMD GPU GFLOP/s: 490 SP GFLOP/s, 98 DP FLOP/s

Warning: Bizarrely low flops (97). Defaulting to 100

Using a target frequency of 60.0
Using a block size of 9216 with 4 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range:          { nu_steps = 320, mu_steps = 800, r_steps = 700 }
Iteration area: 560000
Chunk estimate: 14
Num chunks:     16
Chunk size:     36864
Added area:     29824
Effective area: 589824
Initial wait:   12 ms
Integration time: 20.841387 s. Average time per iteration = 65.129333 ms
Integral 0 time = 21.022696 s
Estimated AMD GPU GFLOP/s: 490 SP GFLOP/s, 98 DP FLOP/s
Warning: Bizarrely low flops (97). Defaulting to 100
Using a target frequency of 60.0
Using a block size of 9216 with 4 blocks/chunk
Using clWaitForEvents() for polling (mode -1)
Range:          { nu_steps = 320, mu_steps = 58, r_steps = 700 }
Iteration area: 40600
Chunk estimate: 1
Num chunks:     2
Chunk size:     36864
Added area:     33128
Effective area: 73728
Initial wait:   13 ms
Integration time: 1.805974 s. Average time per iteration = 5.643669 ms
Integral 1 time = 1.840047 s
Running likelihood with 31815 stars
Likelihood time = 0.519370 s
<background_integral> 0.000046791783021 </background_integral>
<stream_integral>  30.709319636815273  32.378013175681566  105.752119123452120  131.461939979897660 </stream_integral>
<background_likelihood> -3.252500236618621 </background_likelihood>
<stream_only_likelihood>  -5.907225385230095  -3.364735027514926  -3.981496972927704  -6.686684229676112 </stream_only_likelihood>
<search_likelihood> -2.704481373328299 </search_likelihood>
Using SSE4.1 path
Found 1 platform
Platform 0 information:
  Name:       AMD Accelerated Parallel Processing
  Version:    OpenCL 2.1 AMD-APP (3188.4)
  Vendor:     Advanced Micro Devices, Inc.
  Extensions: cl_khr_icd cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_amd_event_callback cl_amd_offline_devices
  Profile:    FULL_PROFILE
Using device 0 on platform 0
Found 2 CL devices
Device 'Ellesmere' (Advanced Micro Devices, Inc.:0x1002) (CL_DEVICE_TYPE_GPU)
Board: Radeon RX 580 Series
Driver version:      3188.4
Version:             OpenCL 1.2 AMD-APP (3188.4)
Compute capability:  0.0
Max compute units:   36
Clock frequency:     1360 Mhz
Global mem size:     3221225472
Local mem size:      32768
Max const buf size:  3221225472
Double extension:    cl_khr_fp64
Build log:
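The bookkeeping lines in the log above can be cross-checked with a little arithmetic. The sketch below is an interpretation inferred from the printed numbers, not taken from the Milkyway@home source. It assumes the iteration area is mu_steps x r_steps, the chunk size is block size x blocks per chunk, the number of chunks is the iteration area divided by the chunk size and rounded up, and the average time per iteration is the integration time divided by nu_steps. The "Chunk estimate" line is not reproduced, because its rule is not obvious from the log.

```python
def derived_figures(nu_steps, mu_steps, r_steps, block_size, blocks_per_chunk,
                    integration_time_s):
    """Recompute one integral's derived log lines from its logged inputs."""
    iteration_area = mu_steps * r_steps
    chunk_size = block_size * blocks_per_chunk
    # Round up so whole chunks cover the iteration area (rule inferred from the log).
    num_chunks = -(-iteration_area // chunk_size)
    effective_area = num_chunks * chunk_size
    return {
        "iteration_area": iteration_area,
        "chunk_size": chunk_size,
        "num_chunks": num_chunks,
        "effective_area": effective_area,
        "added_area": effective_area - iteration_area,
        "ms_per_iteration": 1000.0 * integration_time_s / nu_steps,
    }

# Inputs copied from the two integrals in the log.
integral_0 = derived_figures(320, 800, 700, 9216, 4, 20.841387)
integral_1 = derived_figures(320, 58, 700, 9216, 4, 1.805974)

for label, figures in (("Integral 0", integral_0), ("Integral 1", integral_1)):
    print(label, figures)
```

Under these assumptions the recomputed areas and chunk counts agree with the logged values for both integrals, which suggests the run itself is internally consistent and the warning concerns only the flops estimate.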

TouchuvGrey

This is coproc_info.xml for machine #1, the one with the two RX580s.
Is there anything here that can be modified for better performance?

<?xml version="1.0"?>

<coprocs>


<ati_opencl>

<name>Radeon RX 580 Series</name>

<vendor>Advanced Micro Devices, Inc.</vendor>

<vendor_id>4098</vendor_id>

<available>1</available>

<half_fp_config>0</half_fp_config>

<single_fp_config>190</single_fp_config>

<double_fp_config>63</double_fp_config>

<endian_little>1</endian_little>

<execution_capabilities>1</execution_capabilities>

<extensions>cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_planar_yuv</extensions>

<global_mem_size>8589934592</global_mem_size>

<local_mem_size>32768</local_mem_size>

<max_clock_frequency>1360</max_clock_frequency>

<max_compute_units>36</max_compute_units>

<nv_compute_capability_major>0</nv_compute_capability_major>

<nv_compute_capability_minor>0</nv_compute_capability_minor>

<amd_simd_per_compute_unit>4</amd_simd_per_compute_unit>

<amd_simd_width>16</amd_simd_width>

<amd_simd_instruction_width>1</amd_simd_instruction_width>

<opencl_platform_version>OpenCL 2.1 AMD-APP (3188.4)</opencl_platform_version>

<opencl_device_version>OpenCL 2.0 AMD-APP (3188.4)</opencl_device_version>

<opencl_driver_version>3188.4</opencl_driver_version>

<device_num>0</device_num>

<peak_flops>6266880000000.000000</peak_flops>

<opencl_available_ram>8589934592.000000</opencl_available_ram>

<opencl_device_index>0</opencl_device_index>

<warn_bad_cuda>0</warn_bad_cuda>

</ati_opencl>


<ati_opencl>

<name>Radeon RX 580 Series</name>

<vendor>Advanced Micro Devices, Inc.</vendor>

<vendor_id>4098</vendor_id>

<available>1</available>

<half_fp_config>0</half_fp_config>

<single_fp_config>190</single_fp_config>

<double_fp_config>63</double_fp_config>

<endian_little>1</endian_little>

<execution_capabilities>1</execution_capabilities>

<extensions>cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_planar_yuv</extensions>

<global_mem_size>8589934592</global_mem_size>

<local_mem_size>32768</local_mem_size>

<max_clock_frequency>300</max_clock_frequency>

<max_compute_units>36</max_compute_units>

<nv_compute_capability_major>0</nv_compute_capability_major>

<nv_compute_capability_minor>0</nv_compute_capability_minor>

<amd_simd_per_compute_unit>4</amd_simd_per_compute_unit>

<amd_simd_width>16</amd_simd_width>

<amd_simd_instruction_width>1</amd_simd_instruction_width>

<opencl_platform_version>OpenCL 2.1 AMD-APP (3188.4)</opencl_platform_version>

<opencl_device_version>OpenCL 2.0 AMD-APP (3188.4)</opencl_device_version>

<opencl_driver_version>3188.4</opencl_driver_version>

<device_num>1</device_num>

<peak_flops>1382400000000.000000</peak_flops>

<opencl_available_ram>8589934592.000000</opencl_available_ram>

<opencl_device_index>1</opencl_device_index>

<warn_bad_cuda>0</warn_bad_cuda>

</ati_opencl>

<warning>NVIDIA drivers present but no GPUs found</warning>

<warning>No ATI library found.</warning>

</coprocs>
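The two peak_flops values above can be reproduced from the other fields in each ati_opencl block. This is a sketch under an inferred formula (compute units x SIMDs per unit x SIMD width x instruction width x 2 operations per clock x clock frequency). It matches the numbers in this file but is not quoted from the BOINC source. It also shows that the second card's lower figure comes from its reported max_clock_frequency of 300 MHz, since every other input is identical.

```python
def peak_flops(compute_units, simd_per_cu, simd_width, instr_width, clock_mhz):
    """Estimate peak single-precision FLOP/s from coproc_info.xml fields."""
    # The factor of 2 assumes one fused multiply-add (2 FLOPs) per lane per clock.
    lanes = compute_units * simd_per_cu * simd_width * instr_width
    return lanes * 2 * clock_mhz * 1e6

# Field values copied from the two <ati_opencl> blocks for machine #1.
device_0 = peak_flops(36, 4, 16, 1, 1360)
device_1 = peak_flops(36, 4, 16, 1, 300)

print(f"device 0: {device_0:.0f} FLOP/s")
print(f"device 1: {device_1:.0f} FLOP/s")
print(f"ratio:    {device_0 / device_1:.2f}")
```

A 300 MHz reading looks like an idle clock sampled when the client started, so it may describe what the driver reported at that moment rather than how fast the card runs under load.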

Machine #2:

<?xml version="1.0"?>

<coprocs>


<ati_opencl>

<name>Radeon RX 5500 XT</name>

<vendor>Advanced Micro Devices, Inc.</vendor>

<vendor_id>4098</vendor_id>

<available>1</available>

<half_fp_config>0</half_fp_config>

<single_fp_config>191</single_fp_config>

<double_fp_config>63</double_fp_config>

<endian_little>1</endian_little>

<execution_capabilities>1</execution_capabilities>

<extensions>cl_khr_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_media_ops cl_amd_media_ops2 cl_khr_d3d10_sharing cl_khr_d3d11_sharing cl_khr_dx9_media_sharing cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_liquid_flash cl_amd_copy_buffer_p2p cl_amd_planar_yuv</extensions>

<global_mem_size>8573157376</global_mem_size>

<local_mem_size>65536</local_mem_size>

<max_clock_frequency>1733</max_clock_frequency>

<max_compute_units>11</max_compute_units>

<nv_compute_capability_major>0</nv_compute_capability_major>

<nv_compute_capability_minor>0</nv_compute_capability_minor>

<amd_simd_per_compute_unit>4</amd_simd_per_compute_unit>

<amd_simd_width>32</amd_simd_width>

<amd_simd_instruction_width>1</amd_simd_instruction_width>

<opencl_platform_version>OpenCL 2.1 AMD-APP (3188.4)</opencl_platform_version>

<opencl_device_version>OpenCL 2.0 AMD-APP (3188.4)</opencl_device_version>

<opencl_driver_version>3188.4 (PAL,LC)</opencl_driver_version>

<device_num>0</device_num>

<peak_flops>4880128000000.000000</peak_flops>

<opencl_available_ram>8573157376.000000</opencl_available_ram>

<opencl_device_index>0</opencl_device_index>

<warn_bad_cuda>0</warn_bad_cuda>

</ati_opencl>


<opencl_cpu_prop>

<platform_vendor>Intel(R) Corporation</platform_vendor>


<opencl_cpu_info>

<name>Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz</name>

<vendor>Intel(R) Corporation</vendor>

<vendor_id>32902</vendor_id>

<available>1</available>

<half_fp_config>0</half_fp_config>

<single_fp_config>7</single_fp_config>

<double_fp_config>63</double_fp_config>

<endian_little>1</endian_little>

<execution_capabilities>3</execution_capabilities>

<extensions>cl_khr_icd cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_byte_addressable_store cl_khr_depth_images cl_khr_3d_image_writes cl_intel_exec_by_local_thread cl_khr_spir cl_khr_dx9_media_sharing cl_intel_dx9_media_sharing cl_khr_d3d11_sharing cl_khr_gl_sharing cl_khr_fp64 </extensions>

<global_mem_size>4239839232</global_mem_size>

<local_mem_size>32768</local_mem_size>

<max_clock_frequency>3200</max_clock_frequency>

<max_compute_units>4</max_compute_units>

<nv_compute_capability_major>0</nv_compute_capability_major>

<nv_compute_capability_minor>0</nv_compute_capability_minor>

<amd_simd_per_compute_unit>0</amd_simd_per_compute_unit>

<amd_simd_width>0</amd_simd_width>

<amd_simd_instruction_width>0</amd_simd_instruction_width>

<opencl_platform_version>OpenCL 1.2 </opencl_platform_version>

<opencl_device_version>OpenCL 1.2 (Build 10094)</opencl_device_version>

<opencl_driver_version>5.2.0.10094</opencl_driver_version>

</opencl_cpu_info>

</opencl_cpu_prop>

<warning>NVIDIA drivers present but no GPUs found</warning>

<warning>No ATI library found.</warning>

</coprocs>
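For comparing machines, it can be easier to pull the few interesting fields out of coproc_info.xml than to read the whole file. The sketch below is one way to do that with Python's standard library. The embedded sample is abbreviated from the machine #2 listing, and the helper name load_devices is made up for illustration. If the file is copied from a browser's XML tree view, some opening tags can pick up a leading "-", which has to be stripped before parsing.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample; a real coproc_info.xml has many more fields per device.
PASTED = """<?xml version="1.0"?>
-<coprocs>
-<ati_opencl>
<name>Radeon RX 5500 XT</name>
<max_clock_frequency>1733</max_clock_frequency>
<max_compute_units>11</max_compute_units>
<peak_flops>4880128000000.000000</peak_flops>
</ati_opencl>
<warning>No ATI library found.</warning>
</coprocs>
"""

def load_devices(text):
    """Return (devices, warnings) parsed from coproc_info.xml text."""
    lines = [line.strip() for line in text.splitlines()]
    # Drop the "-" collapse marker a browser tree view puts before opening tags.
    cleaned = "\n".join(line[1:] if line.startswith("-<") else line for line in lines)
    root = ET.fromstring(cleaned)
    devices = []
    for gpu in root.findall("ati_opencl"):
        devices.append({
            "name": gpu.findtext("name"),
            "clock_mhz": int(gpu.findtext("max_clock_frequency")),
            "compute_units": int(gpu.findtext("max_compute_units")),
            "peak_gflops": float(gpu.findtext("peak_flops")) / 1e9,
        })
    warnings = [w.text for w in root.findall("warning")]
    return devices, warnings

devices, warnings = load_devices(PASTED)
for device in devices:
    print(device)
print(warnings)
```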

chooka03

All I can say is I have no idea sorry lol

tazzduke

Hi TouchuvGrey

One thing with the newer AMD cards is that their double precision capability has been reduced with each generation, as they are aimed at gamers. That puts us crunchers at a disadvantage, since Milkyway uses DP.

I did notice that the printout above gave a warning about NVIDIA drivers being present but no NVIDIA GPU being found.
It might be wise to do a full cleanout of the drivers and then reinstall the AMD drivers.
There is a program called DDU, available via guru3d.com, that does this really well.

Also these cards do reasonably well over at Einstein@home, maybe use that as a backup project.

Also, does this computer have 3 AMD GPUs in it?

I suppose you could check on the Milkyway forums to see if your cards can do 2 workunits at a time.

Cheers



 AA 24 - 53 participant

TouchuvGrey

i have removed the drivers and reinstalled the AMD drivers on
Machine #1, the one with the two RX580s in it.

Will do so on # 2, the machine with the RX5500XT in it later today.

i had been hoping for a lot more than i am getting from the 5500XT
based on core clock, memory clock, and GDDR6, vs GDDR5.

i think i will wait a few days to see if there is any improvement and then
swap one of the RX580's over to machine #2 and put the 5500XT in machine #1
just to see what happens.

tazzduke

Hey your RX580s are smashing it, but your 5500xt is running times similar to an RX570.

Had a quick look at some computers running similar hardware over at Milkyway at home.





chooka03

The RX580 has 362.0 - 385.9 GFLOPS double precision.
The 5500XT has 302.2 - 324.7 GFLOPS double precision.

On that basis I would think the RX580 would be better for M@H.

This list seems to confirm that.

https://milkyway.cs.rpi.edu/milkyway/gpu_list.php

TouchuvGrey

Silly me, and here i was thinking that a newer next generation
card would be better.

i have the RX580s running 2 WUs each now.  So far it looks
like about a 35% improvement.

:boom:
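Running two work units per GPU is normally set up in BOINC with an app_config.xml file placed in the project's directory. A minimal sketch of generating one follows. The app name "milkyway" and the cpu_usage value are assumptions, not values from this thread, and should be checked against the app names your own client reports (for example in client_state.xml). A gpu_usage of 0.5 tells the client that each task needs half a GPU, so two run at once on each card.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_name, tasks_per_gpu, cpu_per_task):
    """Build app_config.xml text that runs `tasks_per_gpu` tasks on each GPU."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    # Each task claims 1/N of a GPU, so N tasks share one card.
    ET.SubElement(gpu_versions, "gpu_usage").text = str(1.0 / tasks_per_gpu)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_per_task)
    return ET.tostring(root, encoding="unicode")

# "milkyway" and 0.05 are placeholder assumptions; adjust them for your setup.
config_xml = build_app_config("milkyway", tasks_per_gpu=2, cpu_per_task=0.05)
print(config_xml)
```

After saving the output as app_config.xml, the client has to re-read its config files (or be restarted) before the change takes effect.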

TouchuvGrey

Possibly a dumb question. Would an older  card like a HD7970

FP64 (double) performance
    947.2 GFLOPS (1:4)

be even better with the higher double precision capability
despite the slower GPU clock and memory clock ?
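The double precision figures quoted in this thread follow from a simple formula: shader count x 2 operations per clock x clock speed, divided by the card's FP64 ratio. The sketch below uses reference shader counts, boost clocks and FP64 ratios that are not given in the thread. They are assumptions taken from the cards' published specifications, and partner cards with different clocks will give slightly different numbers.

```python
def fp64_gflops(shaders, clock_mhz, fp64_ratio):
    """Theoretical peak FP64 GFLOPS for a card with a 1:fp64_ratio DP rate."""
    fp32_gflops = shaders * 2 * clock_mhz / 1000.0
    return fp32_gflops / fp64_ratio

# (shaders, clock in MHz, FP64 ratio): assumed reference specifications.
cards = {
    "HD 7970": (2048, 925, 4),
    "RX 580": (2304, 1340, 16),
    "RX 5500 XT": (1408, 1845, 16),
}

peak_dp = {name: fp64_gflops(*spec) for name, spec in cards.items()}
for name, gflops in peak_dp.items():
    print(f"{name}: {gflops:.1f} GFLOPS FP64")
```

On paper the HD 7970's 1:4 ratio more than offsets its lower clock. These are theoretical peaks, though, and real Milkyway run times also depend on drivers and on how well the application uses the hardware.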

tazzduke

Greetings

I think it would be better on crunching times, but it is known to be very power hungry. Do you own the card? I wouldn't buy one these days.

Maybe check on the MW forums; there are a couple of threads dealing with GPUs.

It's been a long time since I ran MW.

Regards





TouchuvGrey

i don't own one but i can get one for $100.00 AUD on Gumtree.

tazzduke

Hi TouchuvGrey

Sounds okay, but I would be bargaining for 50 dollars at the most; otherwise I would be putting the money towards another RX 580 or newer.

You might only get 6 months out of that 7970, but you might get 2 years out of another RX580.

The other thing to look at is the energy efficiency of your GPU crunching.

My thoughts only, but others here who run AMD cards and do Milkyway might be able to offer other solutions.

Cheers

PS: I only run NVIDIA cards (at the moment). When the weather cools down I run GPUGRID and Primegrid, and the 1660 Super I have does well for the low watts it uses. (Einstein sometimes as well.)





Sean

The results from WUProp@home are good for comparing graphics cards:
http://wuprop.boinc-af.org/results/delai.py


Here are some average (min-max) run times in minutes for Milkyway on Windows 64 (I don't run Milkyway so hopefully these results seem realistic).

HD 7970: 2.5 (0.1-7.3)
RX 580: 2.0 (0.8-7.5)
RX 5500 XT: 3.2 (1.4-5.4)
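One way to read these averages is as daily throughput. The sketch below converts the average run times above into tasks per day. It assumes one task at a time per card and that the averages are representative, which the wide min-max ranges suggest may not always hold.

```python
MINUTES_PER_DAY = 24 * 60

# Average Milkyway run times in minutes, copied from the WUProp figures above.
avg_minutes = {"HD 7970": 2.5, "RX 580": 2.0, "RX 5500 XT": 3.2}

tasks_per_day = {card: MINUTES_PER_DAY / minutes
                 for card, minutes in avg_minutes.items()}

# Print the cards from highest to lowest estimated throughput.
for card, tasks in sorted(tasks_per_day.items(), key=lambda item: -item[1]):
    print(f"{card}: about {tasks:.0f} tasks/day")
```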

TouchuvGrey

if i go with another RX580 it looks like it will have to be a used one.
They seem to be discontinued.   : (

Anybody out there have an RX 580 they want to sell? Or an R9 280?
The numbers on that look good.

tazzduke

Sorry, I don't have an RX580, but I have seen some range from $140-$200 on the 2nd hand market, depending on where in Australia you are.

It seems, though, that the short supply of new AMD RX 6000 series cards has blown out 2nd hand market prices.

A little bit of detail to help you out. It also depends on how much you want or have to spend.

HD 7970 (2011 Card) Over 200 Watts
R9 280X (2013 Card) Over 240 Watts
R9 290X (2013 Card) Over 280 Watts - Very Hot to Run
RX 580 (2017 Card) Over 190 Watts
Radeon VII (2019 Card) Over 300 Watts - Very Expensive but it's the KING

See if you could pick up a 2nd hand RX 580 for $150; that's better than spending money on a 2nd hand 2011 or 2013 card, imho.

This is what I could pull off some websites and also Milkyway at home.
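The wattages above can be combined with the average Milkyway run times quoted earlier in the thread from WUProp@home (2.5 minutes for the HD 7970, 2.0 minutes for the RX 580) to get a rough energy cost per task. This is only a sketch: the "over N watts" figures are treated as lower bounds on board power, and it assumes the card draws that much for the whole task, which may not be true in practice.

```python
def watt_hours_per_task(board_watts, avg_minutes):
    """Energy used by one task, assuming constant board power for its duration."""
    return board_watts * avg_minutes / 60.0

# Wattage floors from the list above; run times from the WUProp averages.
hd7970_wh = watt_hours_per_task(200, 2.5)
rx580_wh = watt_hours_per_task(190, 2.0)

print(f"HD 7970: at least {hd7970_wh:.2f} Wh per task")
print(f"RX 580:  at least {rx580_wh:.2f} Wh per task")
```

On these rough numbers the RX 580 uses less energy per task as well as finishing each task sooner.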

Cheers


