Intel Announces Arc Pro B70 and Arc Pro B65 GPUs

techpowerup.com

158 points by throwaway270925 a month ago · 114 comments

genpfault a month ago

600 GB/s of memory bandwidth isn't anything to sneeze at.

~$1000 for the Pro B70, if Microcenter is to be believed:

https://www.microcenter.com/product/709007/intel-arc-pro-b70...

https://www.microcenter.com/product/708790/asrock-intel-arc-...

  • hedgehog a month ago

    Recent kernels have SR-IOV support for these chips too. B&H has them listed for $950.

    https://www.bhphotovideo.com/c/product/1959142-REG/intel_33p...

    When 32GB NVIDIA cards seem to start at around $4000 that's a big enough gap to be motivating for a bunch of applications.

    • robotnikman a month ago

      I'm probably going to snag one of the Intel cards just for the SR-IOV, to use with VMs.
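
      The generic Linux plumbing for that is pretty simple; a minimal Python sketch (assuming the card shows up as card0, the driver exposes SR-IOV, and you're running as root):

          # Minimal sketch: enable SR-IOV virtual functions on a GPU.
          # Assumes the device is card0 and the kernel driver supports SR-IOV.
          from pathlib import Path

          dev = Path("/sys/class/drm/card0/device")

          # How many VFs the device advertises.
          total = int((dev / "sriov_totalvfs").read_text())
          print(f"device supports up to {total} VFs")

          # Create two VFs; each can then be passed through to a VM
          # (e.g. via VFIO) like any other PCI device.
          (dev / "sriov_numvfs").write_text("2")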

      • scrubs a month ago

        I tried to use SR-IOV to virtualize Mellanox NICs with VLANs on Red Hat Linux. Long story short, it did not work. Per Nvidia, the OS also has to run Open vSwitch. This work was on an already complex setup in finance ... so adding Open vSwitch was considered too much additional complexity. This requirement is not something I had run across in the docs.

        Anybody know better?

        • hedgehog a month ago

          The situation in networking is a lot different from graphics. I don't know much beyond the fact that it depends on what specific protocol, card, firmware, and network topology you're using, and there isn't really generic advice. If the question is setting up Ethernet switching inside the card so VFs can talk to the network, then I think the Linux switchdev tools can configure that on their own without Open vSwitch, but you probably need to find someone who understands your specific type of deployment for better advice.

      • hedgehog a month ago

        Depending on what you're doing, AMD's support for VirtIO Native Context might be a useful alternative (I think it gives less isolation, which could be good or bad depending on use).

  • jauntywundrkind a month ago

    I tend to agree that the VRAM size and bandwidth are the core thing, but this B70 Pro allegedly has 387 INT8 TOPS vs a 5090's 3400 INT8 TOPS, and 600 GB/s compares against 1792 GB/s. I'm delighted to see an option at a quarter of the price! But man, a tenth the performance? https://www.techpowerup.com/347721/sparkle-announces-intel-a... https://www.tomshardware.com/pc-components/gpus/nvidia-annou...

    • ColonelPhantom a month ago

      838 seems to be the real INT8 TOPS number for the 5090; going from 800 to 3400 takes an x2 speedup for sparsity (so skipping ops) and another x2 speedup for FP4 over INT8.
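
      Roughly: 838 dense INT8 TOPS × 2 (structured sparsity) × 2 (FP4 vs INT8) ≈ 3,350, which appears to be where the ~3,400 headline figure comes from.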

      So it's closer to half the speed than a tenth. Intel also seems to be positioning this card against the RTX PRO 4000 Blackwell, not the 5090, and that one gets more like 300 INT8 TOPS. The PRO 4000 also has less memory, but at a slightly higher bandwidth. The 5090 is much faster and IIRC priced similarly to the PRO 4000, but it's also decidedly a consumer product, which, especially for Nvidia, comes with limitations (e.g. no server-friendly form factor cards available, and there are or used to be driver license restrictions that prevented using a consumer card in a data center).

      • jauntywundrkind a month ago

        Thank you for the correction. That seemed way too lopsided to be believable. This assessment balances the memory-to-TOPS ratio much more evenly, which is what you'd expect! I was low-key hoping someone would help me make sense of the wildly disparate figures, because I wasn't seeing it.

        To throw one more card into the mix: the AMD R9700 is 378/766 INT8 TOPS dense/sparse, with 644 GB/s from 32GB of memory, at ~$1400. Intel is undercutting that nicely here.

        You're right that for companies, the pro grade matters. For us mere mortals, much less so. Features like SR-IOV, however, are just fantastic to see! Good job, Intel. AMD has been trickling out such capabilities for a decade (cards fused for "MxGPU" capability), and it makes it a much easier buy to just offer it straight up across the models.

    • adgjlsfhk1 a month ago

      Especially for exploratory work, 1/10th the perf is fine. Intel isn't able to compete head to head with Nvidia (yet), but VRAM is capability while speed is capacity. There will be plenty of use cases where the value prop here makes sense.

    • wmf a month ago

      It's more like a 70 class card with extra VRAM.

  • qingcharles a month ago

    I think the B65 is priced at $650. Both are supported by llama.cpp, I believe. With that power draw you could run two of them.

  • giancarlostoro a month ago

    Intel GPU prices have stayed fine, but I do wonder, if they turn out to be viable for inference, whether they will wind up like Nvidia GPUs: severely overpriced.

  • cmovq a month ago

    I mean, it kind of is, considering that's comparable to a 5070, which has 672 GB/s? Benefit of NVIDIA being the only one using GDDR7 for now, I guess.

  • varispeed a month ago

    The product would have been excellent in 2024, but now it's landfill filler. You can run some small models at pedestrian speed, the novelty wears off, and that's it.

    Intel is not looking to the future. If they released an Arc Pro B70 with 512GB of base RAM, now that could be interesting.

    32GB? Meh.

kadoban a month ago

The last go around they looked good on paper and then Intel just didn't make any of them to sell.

Announce all you want, if you don't ever ship anything I could buy, who gives a shit.

  • cmxch a month ago

    The B60 (and the dual edition) were an entire exercise in how NOT to launch a product.

    They let people have the B50 but only released the B60 late in the cycle.

    • kadoban a month ago

      I wasn't even aware they ever _really_ released the B60. When I got bored of paying attention it was ~months after "release" and they just didn't exist to buy. I do technically see them on ebay, so yeah apparently they're out there.

      • cmxch a month ago

        The B60 was released but strictly on a B2B basis until a couple of months ago. The B60 dual, a much rarer bird, was scalped heavily enough to be unobtainable.

whalesalad a month ago

Anyone running an Arc card for desktop Linux who can comment on the experience? I've had smooth sailing with AMD GPUs but have never tried Intel.

  • oakpond a month ago

    Running dual Pro B60 on Debian stable mostly for AI coding.

    I was initially confused about which packages were needed (backports kernel + the Ubuntu kobuk-team PPA works for me). After getting that right I'm now running vLLM mostly without issues (though I don't run it 24/7).

    At first I had major issues with model quality, but the vLLM XPU guys fixed it fast.

    Software capability is not as good as Nvidia's yet (e.g. no fp8 KV cache support last I checked), but with this price difference I don't care. I can basically run a small fp8 local model with almost 100k tokens of context, and that's what I wanted.
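
    For a rough idea of what that looks like, a minimal sketch using vLLM's Python API (the model name, context length, and parallelism here are illustrative assumptions, not my exact configuration):

        # Hypothetical setup: an fp8-quantized model split across two Arc Pro B60s.
        from vllm import LLM, SamplingParams

        llm = LLM(
            model="Qwen/Qwen2.5-Coder-32B-Instruct",  # placeholder model id
            quantization="fp8",        # fp8 weights so it fits in 2x24 GB of VRAM
            tensor_parallel_size=2,    # split the model across both cards
            max_model_len=100_000,     # ~100k token context
        )

        outputs = llm.generate(["Write a haiku about GPUs."],
                               SamplingParams(max_tokens=64))
        print(outputs[0].outputs[0].text)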

    • lostmsu a month ago

      > small fp8 local model with almost 100k token context

      Wouldn't fit Qwen3.5 27B, would it? That's the SOTA.

      • oakpond a month ago

        This is an fp16 model, so that's 54G in weights. I can load it only with fp8 quantization enabled (>= 128k context). I run into this error during generation though: https://github.com/vllm-project/vllm/issues/36350. Looks like an issue with the flash attention backend. But yeah, if you are OK with fp8 quantization on this model, it fits. I expect that with 64G of VRAM it would fit without quantization.
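
        Back-of-the-envelope: 27B params × 2 bytes ≈ 54GB in fp16 (too big for 48GB), × 1 byte ≈ 27GB in fp8, which leaves roughly 20GB for the KV cache and activations and is why the long context still fits.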

  • wyre a month ago

    There was a video a little while back where LTT built a computer for Linus Torvalds and they put an Intel Arc card inside, so I'd imagine Linux support is, at the very least, acceptable.

    [1] https://www.youtube.com/watch?v=mfv0V1SxbNA

    • toofy a month ago

      > they put an Intel Arc card inside

      Just to add a little bit:

      Linus requested the card be Intel as well.

  • robertVance a month ago

    I've run Arc on Fedora for years and for general desktop use it's been perfect. For LLMs/coding it's getting better but it's rough around the edges. Had a bug where trying to get VRAM usage through PyTorch would crash the system, etc.
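
    For context, the kind of query that can hit that path looks roughly like this (a sketch; exact availability of these calls depends on your torch and driver versions):

        # Hypothetical sketch: reading VRAM usage through PyTorch's XPU backend.
        import torch

        if torch.xpu.is_available():
            allocated = torch.xpu.memory_allocated()  # bytes held by live tensors
            reserved = torch.xpu.memory_reserved()    # bytes held by the caching allocator
            print(f"allocated {allocated / 2**30:.2f} GiB, reserved {reserved / 2**30:.2f} GiB")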

  • Levitating a month ago

    Afaik driver support is very complete on Linux. You often see Arc GPUs used in media transcoding workloads for that reason.

    • HerbManic a month ago

      We can all agree that Intel absolutely nailed it with the media encoding on these things. A nice to have for many, vital for others.

      • whalesalad a month ago

        Quick Sync has been around for ages; it's surprising to me that other platforms have not adopted this. No reason a modern CPU can't transcode video.

        • vel0city a month ago

          Quick Sync doesn't do its work on the CPU; it does the work on the integrated GPU. Their processors that did not have on-board graphics did not have Quick Sync support. See their P-series and many of their Xeon parts, which do not carry Quick Sync support, while the versions with integrated graphics do have it.

          AMD chips that have integrated GPUs (their APU series of chips) often do have support for hardware video encoders. Because, once again, it's a function of the GPU and not the CPU.

  • himata4113 a month ago

    Linus Torvalds runs ARC :)

  • bpye a month ago

    My B580 works fine on Linux. Graphics perf is a bit worse than under Windows, but supposedly compute is pretty much the same.

    • BizarroLand a month ago

      I'm using a B580 for a windows 10 media pc and it's fine even for moderate gaming when I drop down to 1080p on my 4k tv, although I did notice a little stuttering from time to time.

      To be fair, that might be due to still running Windows 10 or due to not having reset the PC in 4 years. It's going to be moved over to Linux soon, I'm just being lazy.

  • unethical_ban a month ago

    I'm running A-series Arc for media transcoding and it works just fine.

tbyehl a month ago

Where's the A310 / A40 successor? Gimme some SR-IOV in a slot-powered, single-width, low-profile card.

thefounder a month ago

Why don't they make a GPU optimised for inference/batch jobs with 1 TB of RAM? Everyone wants to run the biggest models locally.

  • esperent a month ago

    I'm not sure it's really possible.

    Take a look at the die shot of a 5090:

    http://dieshot.com/wp-content/uploads/2025/03/Dieshot-GB202-...

    It has 32GB of RAM, and memory controllers are about 10% of the total area. What would you have to do for 1024GB of RAM?

    Not to mention the price would be astronomical.

    • thefounder a month ago

      How is Apple packing 512GB of RAM on their CPUs?

      • whaleofatw2022 25 days ago

        IIRC Apple is using the lower channel width options in LPDDR5.

        I.e. instead of 64-bit channels they do 16-bit (or maybe 32-bit). That lowers the die area needed on the chip for memory controllers.

        But it also impacts bandwidth; AFAIK an M4 Ultra is still on the order of 1/4 the bandwidth of something like a 5090.

jmward01 a month ago

I think this shows a shift in model architecture. MoE and similar architectures need more memory relative to the available compute than one big dense model with a lot of layers and weights. I think this is a trend that will likely accelerate: hardware builds the trade-off in, which encourages even more experts, which deepens the trade-off, so more experts.....

  • zozbot234 a month ago

    Most people doing local inference run the MoE layers on CPU anyway, because decode is not compute-constrained and wasting high-bandwidth VRAM on mostly unused weights is silly; it's better to use it for longer context. Recent setups even offload the MoE experts to fast NVMe (PCIe 5.0 x4 or similar performance): it's slow, but it opens up running even SOTA local MoE models on ordinary hardware.

    • jmward01 a month ago

      I think you are making my point. Having a little slower, but a lot more, memory on the card would speed this use case up a lot and remove the need to go to system memory, or make that memory available for only the very rarely used experts, allowing even larger MoE models to run with good performance.

      • zozbot234 a month ago

        I think speeding up long context and opening up the use of models with larger shared layers is ultimately more relevant than hosting unused MoE layers. Of course you could do that as a last resort, i.e. when running with a smaller context that leaves some VRAM free to use.

        • jmward01 a month ago

          Long context will be solved, capped, and turned into a Θ(1) operation or, at worst, Θ(log n). People don't have infinite perfect recall, so agents don't need it. Also, there are really good solutions to it that just aren't explored enough right now, since transformer architectures are where everyone is dumping money and time. I suspect very soon someone will have a much better system that just takes over, and then the idea of context limits will be a thing of the past. I've actually built something myself that allows infinite context/perfect recall in Θ(1) (with the minor asterisk there has to be, but meh). I know others have solutions too.

          • zozbot234 a month ago

            There are already models with capped long context, but if you make that the whole model, needle-in-a-haystack search becomes impossible, and that's actually a very common operation. Which is why Qwen 3.5 only caps a portion of it, and AIUI the new Nemotron models are broadly similar.

          • arw0n a month ago

            See also the new Deepseek paper on engram transformers for some progress in this area: https://arxiv.org/pdf/2601.07372v1

            They observe significant gains in factual knowledge retrieval capabilities, but reasoning barely moves the needle.

pjmlp a month ago

New cards in 2026, and targeting Vulkan 1.3?!

cmxch a month ago

Good to see that Intel learned to release product to more than just resellers.

Now can we have a 64GB B70 that's available worldwide and not reserved for unicorns like the Maxsun B60 Dual model has been?

SkyeCA a month ago

32GB of vram for a decent price? I wonder if these will work well for VR, because vram is my current main issue.

  • aruametello a month ago

    (VR enthusiast here, mostly under Windows)

    Intel support has been mild to nonexistent in the VR space, unfortunately. Given the very finicky latency + engine support, I wouldn't bet on a great experience, but I hope for the best for more competition in this market. (Even AMD has a lot of caveats compared to Nvidia.)

    Footnotes:

    * Critical "as low as it can be" latency support on Intel Xe is still not as mature as Nvidia's; AMD was lagging behind until recently.

    * Not sure about "multiprojection" rendering support on Intel; lack of support can kill VR performance or make it incompatible. (Optimized VR games often rely on it.)

    • HerbManic a month ago

      It looked like when Intel jumped into this space, they tried to do everything at once. It didn't work well; they were playing catch-up to some very mature systems. They are now being much more selective and restrained. The downside is that things like VR support get put on the back burner for years.

      Good for most people, but if you need that functionality and they don't have it, go somewhere else.

nickthegreek a month ago

Both have 32GB of VRAM. Could be a pretty compelling choice.

  • cptskippy a month ago

    They certainly look viable as replacements for my Tesla P40 for virtual workloads.

SmellTheGlove a month ago

Any idea if it'll be possible to mix these with nvidia cards? Adding 32GB to a single 3090 setup would be pretty nice.

lostmsu a month ago

Nothing like Crossfire/SLI? Not possible to efficiently connect multiple cards for one large model?

mikelitoris a month ago

Too little too late, classic Intel

DiabloD3 a month ago

Since they fired the entire Arc team (a lot of the senior engineers have already updated their LinkedIn profiles to reflect new positions at AMD, Nvidia, and others), as well as laying off most of their Linux driver team (GPU and non-GPU), uh...

WTF?

  • staticman2 a month ago

    You are exaggerating, right? They didn't really fire the entire Arc team did they? I couldn't find a source saying that.

    • DiabloD3 a month ago

      Nope, no exaggeration.

      The news that Celestial is basically canceled already hit the HN front page, as has the news that Druid was canceled before tapeout.

      Celestial will only be issued in the variant that comes in budget/industrial embedded Intel platforms that have a combined IO+GPU tile, but the performance big boy desktop/laptop parts that have a dedicated graphics tile will ship an Nvidia-produced tile.

      There will be no Celestial dGPU variant, nor a dedicated-tile variant. Drivers will be ceasing support for dGPUs of all flavors, and no new bug fixes will happen for B-series GPUs (as there are no B-series iGPUs; A-series iGPUs will remain unaffected).

      They signed the deal like 2-3 months ago to cancel GPUs in favor of Nvidia. The other end of this deal is that future Nvidia SBCs will be shipping as big-boy variants with Xeon CPUs: Rubin (replacing Blackwell) for the GPU, Vera (replacing Grace) as the on-SBC GPU babysitter, and newest-gen Xeons to do the non-inference tasks that Grace can't handle.

      There is also talk that this deal may lead to Nvidia moving to Intel Foundry, away from TSMC. There is also talk that Nvidia may just buy Intel entirely.

      For further information, see Moore's Law Is Dead's coverage off and on over the past year.

      • chao- a month ago

        You may be a bit too credulous. There has been a "leak" or "rumor" that Intel's GPU initiatives are canceled about once every three months, for over two years. Yet Intel continues to release new SKUs and make new product announcements. Just last month they announced a new data center GPU product (an inference-focused variant of Jaguar Shores).

        I can't see the future, but I can see patterns: the media that reports straight from the industry rumor mill LOVES this "Intel has cancelled its GPUs" story, for whatever reason. I have no particular love for Intel (out of my six current systems, my only Intel box is a cheap NUC from 2018), but at this point these rumors echo the old joke about economists who "accurately predicted nine of the last two recessions".

      • gk-- a month ago

        ah, so this is MLID. yeah i'll wait for the announcement.

      • mtlmtlmtlmtl a month ago

        MLID has been saying Arc was cancelled since before the first Alchemist cards were released.

      • PowerElectronix a month ago

        MLID is a terrible information source.

      • thesmart a month ago

        The idea that Intel's foundry could replace TSMC is hilarious. No. Maybe a gamer-focused mid-market card based on 30-series.

  • wtallis a month ago

    This is a chip they've had lying around for a while. It's the same architecture as used in the Arc B580 that launched at the end of 2024; this is just a slightly larger sibling. Intel clearly knew that their larger part wouldn't make for a competitive gaming GPU (hence the lack of a consumer counterpart to these cards), but must have decided that a relatively cheap workstation card with 32GB might be able to make some money.

    • throwaway85825 a month ago

      Now, if they had launched the 32GB workstation card in 2024 with cheap RAM, it would have been a success.

    • DiabloD3 a month ago

      Still seems crooked to sell a GPU that has already lost its driver team and will get no new meaningful updates.

      • wtallis a month ago

        Does it need a huge driver team pushing out big updates in order to be suitable for the kind of Pro use cases it's targeted at? They're explicitly not going after the gaming market so they don't need to be on the treadmill of constant driver updates delivering workarounds and optimizations for the latest game releases.

        They're still going to be employing some developers for driver maintenance for the sake of their iGPUs, and that might be enough for these cards.

  • unethical_ban a month ago

    I didn't know this. Have they officially given up on building discrete GPUs? Is this a last gasp of Arc to offload decent remaining architectures at a lower price than nvidia?

    It is crazy to me that, in a world newly craving GPU hardware for AI, with gamers being largely neglected, Intel would abandon an established product line.

    • StilesCrisis a month ago

      It does sound like a very Intel choice though.

    • mschuster91 a month ago

      > It is crazy to me that, in a world newly craving GPU hardware for AI, with gamers being largely neglected, Intel would abandon an established product line.

      You still need to fab it somewhere. Intel's fabs have been plagued with issues for years, the AI grifters have bought up a lot of TSMC's allotments, what remains got bought up by Apple for their iOS and macOS lineups, and Samsung's fabs are busy doing Samsung SoCs.

      And that unfortunately may explain why Intel yanked everything. What use is a product line that can't be sold because you can't get it produced?

      Yet another item on my long list of "why I want to see the AI grift industry burn and the major participants rotting in a prison cell".

WarmWash a month ago

Wake me when they wake up and release a middling card with 128GB memory.

  • zozbot234 a month ago

    Buy Strix Halo or Apple Silicon platforms and you get essentially that.

  • Weryj a month ago

    Buy 4?

    • electronsoup a month ago

      Which mainboards are cheap and have 4 PCIe x16 (electrical) slots, and don't need weird risers to fit 4 GPUs?

      • SmellTheGlove a month ago

        Consumer CPUs don't have enough PCIe lanes to do that. Even if they had physical x16 slots, at most two of them would be x16.
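
        Rough math: 4 cards × 16 lanes = 64 lanes, versus the couple dozen CPU lanes on a consumer AM5 or LGA1700 platform, so you're looking at HEDT or server sockets.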

        What's cheap to you? You can find Epyc 7002/7003 boards on eBay in the $400 range and those will do it. That's probably the best deal for 4x PCIe 4.0 x16 and DDR4. Probably the $500 range with a CPU. That's in the ballpark of a mid-to-high-end consumer setup these days.

        • Weryj a month ago

          That's the path I'm taking, and with all those PCIe lanes available.

          If I had more income, I would also buy 4x 96GB Optane drives as p0 swap disks and a few SSDs as p1 swap disks, to evaluate how well you can get a 1T model running given these absurd RAM prices.

      • irishcoffee a month ago

        If your actual gripe is risers, sounds like a "you" problem, not a technical problem.

        • MrDrMcCoy a month ago

          Even if you're fine with risers, that might not be enough. If the bridge lanes are PCIe Gen 3, as on many consumer boards, your Gen 5 card might not init. I extensively tested several motherboards to try to get my AM5 CPU talking to a triple Radeon AI Pro 9700 XT setup, and they absolutely refuse to come up on PCIe 3. I was using dummy EDID plugs for them, so they think they have a display, ruling out that issue.

          What I eventually had to do was buy a used Threadripper box to run those cards, because PCIe Gen 4 definitely works.

    • WarmWash a month ago

      Because I don't want to spend $4k.

      I want to spend $1500 for a card that can run a proper large model, even if it can only do 25 tok/s.

      Intel is squandering a golden opportunity to kneecap AMD and Nvidia, under the totally delusional pretense that Intel enterprise cards still have a fighting chance.

      • ericd a month ago

        I saw a good quote recently, "you're not going to get 128 gigs of vram loose in a plastic bag for that much".

mmwelt a month ago

At the end of the 2nd paragraph:

> Intel will provide certified drivers for Windows 11, Windows 10, and Linux.

Windows 11, OK. Linux, OK. But why Windows 10 for a new product?!

vessenes a month ago

Not sure why you'd want this over an Apple setup. The M4 Max has 545GB/s of memory bandwidth, and $2k gets an entire Mac Studio with 48GB of RAM vs 32GB for the B70.

  • hedgehog a month ago

    Being able to keep infrastructure on Linux is a big advantage.

    • RestartKernel a month ago

      How many compatibility issues is macOS realistically expected to spur? The Windows dev experience felt unusable to me without a Linux VM (and later WSL), but on macOS most tooling just kinda seems to work the same.

      • einr a month ago

        It's not the tooling for me; macOS is just bad as a server OS for many reasons: weird collisions with desktop security features, aggressive power saving that you have to fight against, root not being allowed to do root stuff, no sane package management, no OOB management, ultra-slow OS updates, and generally but most importantly: the UNIX underbelly of macOS has clearly not been a priority for a long time and is rotting with weird, inconsistent, and undocumented behaviour all over the place.

        • wolfhumble a month ago

          > Weird collisions with desktop security features

          Linux is not immune to BIOS/UEFI firmware attacks either. Secure Boot, TPM, and LUKS can work well together, but you still depend on proprietary firmware that you do not fully control. LogoFAIL is a good example of that risk, especially in an evil maid scenario involving temporary physical access. I think Apple has tighter control over this layer.

      • bigyabai a month ago

        For server usage? macOS is the least-supported OS in terms of filesystems, hardware and software. It uses multiple gigabytes of memory to load unnecessary user runtime dependencies, wastes hard drive space on statically-linked binaries, and regularly breaks package management on system upgrades.

        At a certain point, even WSL becomes a more viable deployment platform.

      • hedgehog a month ago

        Provisioning, remote management, containers, virtualization, networking, graphics (and compute), storage, all very different on Mac. The real question is what you would expect to be the same.

  • protimewaster a month ago

    My thinking is that I'd pick this, because I can't just plug a Mac into a slot in my server and have it easily integrate with all my other hardware across an ultra fast bus.

    If they made an M4 on a card that supported all the same standards and was price competitive, though, that might be a good option.

  • fvv a month ago

    With those $2k you can have 2x B70, with 1.2TB/s and 64GB of VRAM, on Linux (and you can scale further, while Mac price increases are not linear).

    • Reubend a month ago

      You're absolutely right. And these Intel GPUs will also be much faster in terms of actual math than the M series GPUs that the Apple setup would have.

  • cptskippy a month ago

    Support for Single Root IO Virtualization (SR-IOV) to enable compute and Graphics workloads in virtualized environments.

  • thesmart a month ago

    Because the B70 cards can pipeline 500 tok/s on concurrent workloads. Apple Silicon and Nvidia consumer cards only work well w/ serial workloads.

  • wyre a month ago

    one can upgrade and swap parts with a computer running an Intel GPU. Linux is very well supported compared to Mac hardware.

  • pjmlp a month ago

    Some folks care about the workstation market, and the flexibility it offers in choice.

  • 2OEH8eoCRo0 a month ago

    Funny, I'm not sure why anyone would use Apple over Linux.

    • pjmlp a month ago

      Good support on laptops that I can buy at Media Market, FNAC, Cool Blue,....

      Although personally I am more of the Windows/Linux VM workstation laptop kind.
