S_x96x_S
addikt
Software support for Zen 4 is coming along rather slowly...
but at least it is making progress: GCC 13 Now Enables 512-bit Vector For AMD Zen 4 Tuning
https://www.phoronix.com/news/GCC-13-Zen-4-Znver4-512b-Vector
Enable 512 bit vector for zen4
While internally 512-bit registers are split into two 256-bit halves, 512-bit vectors
reduce the number of instructions to retire and have a chance to improve parallelism.
There are a few TSVC benchmarks that improve significantly:
                 runtime
benchmark    256bit   512bit   change
s2275         48.57    20.67     -58%
s311          32.29    16.06     -50%
s312          32.30    16.07     -50%
vsumr         32.30    16.07     -50%
s314          10.77     5.42     -50%
s313          21.52    10.85     -50%
vdotr         43.05    21.69     -50%
s316          10.80     5.64     -48%
s235          61.72    33.91     -45%
s161          15.91     9.95     -38%
s3251         32.13    20.31     -36%
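To give a feel for the kind of loop behind these numbers, here is a minimal sketch of a sum reduction in the spirit of the vsumr kernel. The function name sum_reduce is made up for illustration, and the real TSVC source may look different. The build flags in the comment are existing GCC options, but the commit message does not say which flags were used for the measurements.

```c
#include <stddef.h>

/* Sum reduction in the spirit of TSVC's vsumr kernel (illustrative only,
   the exact TSVC source may differ). Build, for example, with:
       gcc -Ofast -march=znver4 -c sum.c
   -Ofast (or -ffast-math) matters here: GCC only vectorizes a
   floating-point reduction when it is allowed to reorder the additions.
   To compare against 256-bit code with the same compiler, add:
       -mprefer-vector-width=256 */
float sum_reduce(const float *a, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```

With 512-bit vectors each vector add covers 16 floats instead of 8, so a long loop like this retires roughly half as many vector instructions.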
And there are no benchmarks with regressions beyond noise. The basic matrix
multiplication loop improves by 32%. It is also expected that 512-bit
vectors are more power efficient (I can't measure that).
The downside is that loops with low trip counts may get slower when the
unvectorized prologue and epilogue are hit more often. With SPECfp this
problem happens with x264 (12% regression) and bwaves (6% regression);
this is tracked in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108410
and will need more work on the vectorizer to support masked epilogues.
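The low trip count problem is easy to see with a bit of arithmetic. The sketch below is a simplified model, not GCC code: the helper split_trip_count and the struct trip_split are hypothetical names, and the model ignores the alignment prologue and the possibility of a narrower vectorized epilogue.

```c
#include <stddef.h>

/* Hypothetical helper, not GCC code: models how a vectorized loop with a
   scalar epilogue divides n iterations for a given vectorization factor
   (elements per vector, e.g. 8 floats at 256 bits, 16 floats at 512 bits). */
struct trip_split {
    size_t vector_iters;  /* full-width vector iterations */
    size_t scalar_iters;  /* leftover iterations run one element at a time */
};

struct trip_split split_trip_count(size_t n, size_t vf)
{
    struct trip_split s;
    s.vector_iters = n / vf;  /* whole vectors that fit in n elements */
    s.scalar_iters = n % vf;  /* remainder handled by the scalar epilogue */
    return s;
}
```

For a loop over 15 floats, a factor of 8 gives 1 vector iteration plus 7 scalar ones, while a factor of 16 gives 0 vector iterations and 15 scalar ones, so in this model the wider vector never runs at all. A masked epilogue would instead handle the leftover elements in one vector operation with the unused lanes disabled.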
After some additional testing it seems that using 512-bit vectors by
default is now the better choice overall.