11 Comments

Revisiting this because it's incredible. Good work!

author

Thanks, Logan!

Mar 6 · Liked by Abhinav Upadhyay

Great post, thank you very much!

Mar 2 · Liked by Abhinav Upadhyay

Great post, Abhinav. I learnt a lot. I wonder if you could also cover, or point me toward resources on, how a tensor operation differs at the library level from the application's point of view. For example, what changes (if any) need to be made in PyTorch or similar frameworks to better exploit the massive parallelism offered by TSPs, or is this handled entirely by Groq's compiler behind the scenes? I understand Groq hasn't published anything yet, but if you came across any nuggets in your research, please do share!

author

There aren't a lot of details available on this. It looks like their compiler can take a PyTorch or TensorFlow model and compile it for their hardware. However, the Groq Twitter account also hints that sometimes they have to rewrite the code, so it's not quite clear in which situations the compiler works without any manual intervention.

I'm just guessing here.

Mar 2 · Liked by Abhinav Upadhyay

I'm so impressed with Groq, great to know more about the technical details.

Mar 2 · Liked by Abhinav Upadhyay

Fantastic post, Abhinav! Tremendous research and really topical too.

author

Thanks, Babbage :)

Mar 2 · Liked by Abhinav Upadhyay

Thanks for the post, Abhinav! This was really insightful and I learned a ton.

author

Thanks, Sridaran!


Thanks for the post. I wonder whether Groq used any parts of RISC-V, or whether the TSP just works like other ISAs (x86/ARM/RISC-V).
