Linus wishes death on AVX-512, says Intel should spend the budget on more CPU cores like AMD instead

By mheevariety

on 14 July 2020 - 13:36 Tag: Linus Torvalds, Programming, GCC, Intel

Linus Torvalds

Linus Torvalds lashed out at the AVX-512 instruction set after learning that Alder Lake CPUs lack the feature, saying "I hope AVX512 dies a painful death," and suggested that Intel use the die area for something else, such as improving per-core performance or adding more cores the way AMD did.

He said Intel should get back to basics and get its process working again, instead of inventing new instructions and then creating benchmarks that make its own chips look good. He wants Intel to fix real problems and concentrate more on regular code that is not HPC or some other special case, and to stop focusing on floating-point (FP) math, because it does not matter that much to users. In the heyday of x86, every Intel competitor had better FP performance, yet Intel still crushed them all.

AVX-512 is a wide vector instruction set. It first appeared in Xeon Phi chips, which were accelerator cards built specifically for high-performance computing (HPC), but Intel later began adding it to many of its CPUs.

Linus admits he may be somewhat biased, since he detests FP benchmarks, but he still sees AVX-512 as a misstep by Intel. In his view Intel should stop focusing on special cases nobody cares about and get serious about the processor fundamentals, making ordinary code run as well as possible, then bolt on an FPU (floating-point unit) that is barely good enough. That alone would make people happy, and AVX2 is already more than enough.

Source - Phoronix, Real World Technologies

Image: Linus Torvalds, from a video by The Linux Foundation

tom789 Tue, 14/07/2020 - 13:57

No wonder he went and bought AMD.

gololo Tue, 14/07/2020 - 19:08

He used Intel before this. For him, whatever suits the job and gets the work done faster is probably what he uses. I doubt it has anything to do with this instruction set.

Honestly, adding special-purpose instructions is the fastest way to make the programs that use them dramatically faster. Work that used to take several loops of ordinary instructions finishes in a single instruction, and for that particular workload it delivers more performance than adding cores. If the workload also appears in benchmark programs, the score can be used to sell the chip too. (My guess is that Apple does this as well, because in speed-test comparison videos the gap is not as large as the benchmark programs suggest.)

Normally vendors analyze which workloads are common in a given CPU's market and build instructions for them into that CPU. Bringing instructions from the flagship down to lower-end CPUs is either what you do when you have run out of ways to compete or simply normal practice, I am not sure which. If users benefit, fine, but it is probably tiring for the people who port and verify the instructions, including those who help maintain the kernel, like him.

osmiumwo1f Tue, 14/07/2020 - 13:57

Intel, just don't do what AMD FX did and share one FPU between two cores.

bosszz Tue, 14/07/2020 - 15:24

Doubling the cores while keeping the same number of FPUs is no different from Intel HT/AMD SMT, which share a single set of execution units.

Quinn Tue, 14/07/2020 - 14:17

That's one big shot fired.

kernelbase Tue, 14/07/2020 - 17:24

I agree with Linus on this one.

sabayjoo_ Tue, 14/07/2020 - 14:38

I wonder what the CEO of OCZ will say when he reads this. He has been hyping AVX512 to the skies.

K_AViar Tue, 14/07/2020 - 23:39

AVX 512, the mightiest thing in all three worlds :P

rainhawk Tue, 14/07/2020 - 14:52

Which apps support AVX-512?

ozbee Tue, 14/07/2020 - 15:11

This news made me laugh until my stomach hurt. The fans hyped it so much, and now it gets trashed!

mr_tawan Tue, 14/07/2020 - 15:15

I sat down and looked at which AVX-512 instructions each chip supports, and found that ... not a single chip supports all of them.

Good luck to anyone writing a compiler, haha.
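
To make the fragmentation point in the comment above concrete, here is a minimal Python sketch that splits AVX-512 subsets into those a CPU advertises and those it does not. It assumes a Linux-style space-separated `flags` string such as the one in `/proc/cpuinfo`. The subset list is a partial, illustrative selection, and the names `KNOWN_SUBSETS`, `avx512_subsets`, and `SAMPLE_FLAGS` are made up for this example.

```python
# Partial, illustrative list of AVX-512 subset flag names as Linux reports them.
KNOWN_SUBSETS = [
    "avx512f", "avx512cd", "avx512er", "avx512pf",
    "avx512dq", "avx512bw", "avx512vl",
    "avx512ifma", "avx512vbmi", "avx512_vnni",
]


def avx512_subsets(flags_line):
    """Split the known subsets into (present, missing) for one flags string."""
    flags = set(flags_line.split())
    present = [name for name in KNOWN_SUBSETS if name in flags]
    missing = [name for name in KNOWN_SUBSETS if name not in flags]
    return present, missing


# Hypothetical flags string for illustration, not copied from a real CPU.
SAMPLE_FLAGS = "fpu sse2 avx avx2 avx512f avx512cd avx512dq avx512bw avx512vl"

present, missing = avx512_subsets(SAMPLE_FLAGS)
print("present:", present)
print("missing:", missing)
```

Software that wants to use a given subset has to perform a check of this kind at run time and keep a fallback path, which is the extra burden the comment is joking about.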

McKay Tue, 14/07/2020 - 15:27

Bash AVX512 all you want, but bashing FPUs and other vector units along with it strikes me as the view of a frog under a coconut shell.

Other people's work is not just compiling kernels. Simply put, people who buy a computer mainly for gaming, or for multimedia work, surely outnumber people in Linus's group, and those workloads already lean heavily on the FPU and vector units.

mr_tawan Tue, 14/07/2020 - 17:22

He said AVX2 is enough. He did not claim that everything else is bad, did he?

The gist I took from it is that instead of focusing on workloads most people do not run (such as vector operations that wide), it would be better to focus on things people would actually benefit from (such as adding cores).

P.S. I checked the GD.net Discord and nobody is talking about AVX-512 at all. I guess nobody cares, haha.

McKay Tue, 14/07/2020 - 19:43

Just read the source.

I hope AVX512 dies a painful death, and that Intel starts fixing real problems instead of trying to create magic instructions to then create benchmarks that they can look good on.

I hope Intel gets back to basics: gets their process working again, and concentrate more on regular code that isn't HPC or some other pointless special case.

I've said this before, and I'll say it again: in the heyday of x86, when Intel was laughing all the way to the bank and killing all their competition, absolutely everybody else did better than Intel on FP loads. Intel's FP performance sucked (relatively speaking), and it matter not one iota.

Because absolutely nobody cares outside of benchmarks.

The same is largely true of AVX512 now - and in the future. Yes, you can find things that care. No, those things don't sell machines in the big picture.

And AVX512 has real downsides. I'd much rather see that transistor budget used on other things that are much more relevant. Even if it's still FP math (in the GPU, rather than AVX512). Or just give me more cores (with good single-thread performance, but without the garbage like AVX512) like AMD did.

I want my power limits to be reached with regular integer code, not with some AVX512 power virus that takes away top frequency (because people ended up using it for memcpy!) and takes away cores (because those useless garbage units take up space).

Yes, yes, I'm biased. I absolutely destest FP benchmarks, and I realize other people care deeply. I just think AVX512 is exactly the wrong thing to do. It's a pet peeve of mine. It's a prime example of something Intel has done wrong, partly by just increasing the fragmentation of the market.

Stop with the special-case garbage, and make all the core common stuff that everybody cares about run as well as you humanly can. Then do a FPU that is barely good enough on the side, and people will be happy. AVX2 is much more than enough.

Yeah, I'm grumpy.

Linus

mr_tawan Wed, 15/07/2020 - 00:04

This is the only part I am looking at.

Stop with the special-case garbage, and make all the core common stuff that everybody cares about run as well as you humanly can. Then do a FPU that is barely good enough on the side, and people will be happy. AVX2 is much more than enough.

McKay Wed, 15/07/2020 - 00:58

If you read the whole thing, you will see that the sentence is a jab, not a concession. In short:

Stop with the special-case garbage, and make all the core common stuff that everybody cares about run as well as you humanly can. Then do a FPU that is barely good enough on the side, and people will be happy. AVX2 is much more than enough.

mheevariety Wed, 15/07/2020 - 01:57

The feel is roughly: just having AVX2 is already more than enough (dammit).

aeksael Tue, 14/07/2020 - 18:08

I agree with this view too. It goes with the times and the trends.

aet Thu, 16/07/2020 - 17:24

Where exactly did he bash FPUs and other vector units?

McKay Thu, 16/07/2020 - 19:04

In the part I already explained above, which you happened not to read.

Fourpoint Tue, 14/07/2020 - 15:45

It is normal for fanboys to hype what they bought and hunt for flaws in the competition :P (I am one too.)

New instruction sets are good to have. Those who use them get the speedup, those who do not simply do not, and if nobody uses them at all they will disappear on their own. If CPUs only evolved by adding cores and pushing clocks, they would eventually hit a wall, as in the P4 era.

P.S. A benchmark just measures one particular aspect, so look at the aspect you care about. If you buy for gaming, look at gaming tests rather than at how fast it renders 3D, when some people never render anything before the warranty runs out and game 99.99% of the time.

That does not mean ignoring the numbers and going by feel. That is more like being fooled by marketing :P

P.S. 2 This reminds me of ray tracing when NVIDIA first launched it. Fanboys of the other camp sneered that nobody used it and asked why anyone needed pretty shadows. Now that it is part of the new standard and the PS5 has it, their excuse is that you do not need NVIDIA for it because the competitor can do it too.

Bigkung Tue, 14/07/2020 - 16:05

Could adding new instructions turn into future vulnerabilities, Intel?

e.p. Tue, 14/07/2020 - 16:12

Part of the issue probably comes from this as well: when AVX is in use, the maximum clock frequency drops, or the high clocks are available on fewer cores. (And people ended up using it for memcpy rather than for the heavy computation it was meant for.)

"I want my power limits to be reached with regular integer code, not with some AVX512 power virus that takes away top frequency (because people ended up using it for memcpy!) and takes away cores (because those useless garbage units take up space)."

wichate Tue, 14/07/2020 - 16:13

Intel's new instruction sets are there for hackers to use. Ordinary people do not use them (and by the time you notice, you have already been breached).

takwing Tue, 14/07/2020 - 17:09

Linus seems a bit too biased, because the kernel normally does not use floating-point instructions anyway.
But plenty of ordinary people need them to some degree.

mr_tawan Tue, 14/07/2020 - 17:29

I looked at the AVX-512 instruction set and saw that it comes with 32 registers of 512 bits each.

My first thought was: good grief, is this a register file or an L1 cache?

Then I remembered that I forgot to divide by 8, haha.

mr_tawan Tue, 14/07/2020 - 17:35

But thinking about it more, 2 KB of registers probably does take up a fair amount of die area ...
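
The arithmetic behind the two comments above can be checked with a short Python sketch. The register count and width come from the comment. The lane table at the end is an added illustration, and it counts only the architectural registers, not whatever larger physical register file a given chip implements.

```python
REGISTER_COUNT = 32   # number of 512-bit registers, as stated in the comment
REGISTER_BITS = 512

total_bits = REGISTER_COUNT * REGISTER_BITS
total_bytes = total_bits // 8          # the "divide by 8" step

print(total_bits)    # 16384
print(total_bytes)   # 2048, i.e. 2 KiB

# How many elements fit in one 512-bit register for common element widths.
for element_bits in (64, 32, 16, 8):
    lanes = REGISTER_BITS // element_bits
    print(f"{element_bits}-bit elements: {lanes} lanes")
```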

nessuchan Tue, 14/07/2020 - 17:30

It is like when people complain that phones are already fast enough and the effort should go into making batteries last longer.

IDCET Tue, 14/07/2020 - 17:41

And I think that is how it should be right now: spend the time packing in features that make the battery last longer, run cooler, and draw less power.

TeamKiller Wed, 15/07/2020 - 00:15

I think these days it is more like the same heat and the same battery drain, but more performance. Except for the buggy models, which run hot as fire and drain the battery like water.

Architec Tue, 14/07/2020 - 17:56

Thank you, Prophet.

Remma Tue, 14/07/2020 - 20:20

After hearing him talk like this, I have to see him in a new light.

I think someone should invite him out to do something else for a change and open his mind to new things, instead of spending all day compiling kernels and yelling at other programmers.

Hoo Tue, 14/07/2020 - 20:56

From a general consumer's point of view,
this is probably spot on about why Intel has lately let AMD close the gap.
That is, go back and deal with the fundamentals, which have recently been hit by a string of security exploits that required patches costing a lot of speed.

Besides, ordinary floating-point work already has the GPU.
It is not a dedicated circuit, but it is probably better value for money from a consumer's point of view.
(I cannot think of any everyday task that needs 512-bit floating point.)

takwing Tue, 14/07/2020 - 23:54

The reason SIMD extensions for floating-point operations are still needed is that GPUs are inefficient when vectorized code contains many branches.
SIMD extensions on the CPU, by contrast, work better alongside branchy code because the CPU pipeline is optimized with branch prediction, out-of-order execution, and so on, while GPU pipelines lack these features.
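
The branch cost that the comment above alludes to can be illustrated with a toy Python model. It assumes lanes are processed in lockstep groups, and that a group whose lanes disagree on a branch must run both paths with the inactive lanes masked off. The function name `lockstep_cost` and all cost numbers are invented for this sketch. It does not model branch prediction or out-of-order execution, and real hardware is far more complex.

```python
def lockstep_cost(conditions, then_cost, else_cost, group_size):
    """Total cost when lanes run in lockstep groups.

    A group pays for the 'then' path if any lane takes it and for the
    'else' path if any lane takes that, so a divergent group pays for both.
    """
    total = 0
    for start in range(0, len(conditions), group_size):
        group = conditions[start:start + group_size]
        if any(group):
            total += then_cost
        if not all(group):
            total += else_cost
    return total


uniform = [True] * 32                       # every lane takes the same path
mixed = [i % 2 == 0 for i in range(32)]     # lanes alternate between paths

print(lockstep_cost(uniform, 10, 10, 32))   # 10
print(lockstep_cost(mixed, 10, 10, 32))     # 20
```

In this model the divergent input costs twice as much as the uniform one even though each lane does the same amount of useful work, which is the kind of inefficiency the comment attributes to branch-heavy code on lockstep hardware.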

i3i4i5 Tue, 14/07/2020 - 21:46

Skimming the headline, I thought it was about Elon Musk's kid. I had to go back and read it again.

mheevariety Wed, 15/07/2020 - 01:56

Hahahahahahahaha

foizy Tue, 14/07/2020 - 21:54

On the main point, he would be right if he had not put it so harshly.
It is too specialized and takes up a large share of the die, and that is a cost the majority of users end up sharing by default.
You pay 30% of the price for a specific feature that serves very limited workloads.

If it exists at all, it should live only in truly specialized products. But then developers have to branch and write two code paths.
Speaking generically, I agree with him.

-------- In practice, the companies that would use instructions like these should band together and publish a standard/instruction set as an add-on instead.