ThomasRaoux changed the title from "Initial support for Nvidia Blackwell GPUs (sm_100)." to "Add support for Nvidia Blackwell GPUs"
Initial support for Nvidia Blackwell GPUs (sm_100).

The key contributions included in this PR are:
* Support for 5th generation Tensor Core.
* Modeling and support of Tensor Memory.
* Native support for microscaling formats mxfp4 and mxfp8.
* Improvements to the software pipeliner to take advantage of Tensor Cores and Tensor memory

This was developed in close collaboration between Nvidia and OpenAI.

From Nvidia:
* dePaul Miller (@depaulmillz)
* Samantha Hirsch (@Sam3077)
* Yujia Zhai (@yzhaiustc)
* Shang Zhang (@shangz-ai)
* Pradeep Ramani (@IonThruster)
* Matthew Brookhart (@mbrookhart)
* Masahiro Masuda (@masahi)
* Chris Sullivan (@csullivan)
* Clive Unger (@CliveUnger)
* Jason Knight (@binarybana)

From OpenAI:
* Pawel Szczerbuk (@pawelszczerbuk)
* Peter Bell (@peterbell10)
* Phil Tillet (@ptillet)
* Jeff Niu (@jeffniu-openai)
* Thomas Raoux (@ThomasRaoux)

---------

Co-authored-by: Baogang Song <baogang@openai.com>
Co-authored-by: Pawel Szczerbuk <pawel.szczerbuk@openai.com>
Co-authored-by: Sergei Vorobev <xvorsx@gmail.com>
Co-authored-by: ionthruster <pradeepramni@gmail.com>
Co-authored-by: dePaul Miller <depaulmillz@users.noreply.github.com>
Co-authored-by: Matthew Brookhart <mbrookhart@nvidia.com>
Co-authored-by: Chris Sullivan <chris@sullivan.ai>
Co-authored-by: Masahiro Masuda <mmasuda@nvidia.com>
Co-authored-by: Chris Sullivan <chrsullivan@nvidia.com>
Co-authored-by: peterbell10 <peterbell10@openai.com>
Co-authored-by: jeffniu-openai <jeffniu@openai.com>
AlexAUT pushed a commit to AlexAUT/triton that referenced this pull request
makslevental pushed a commit to makslevental/triton that referenced this pull request