Apple’s architecture comes with its own trade-offs: it gives them huge capacity and pretty good bandwidth, but not nearly as much bandwidth as Nvidia’s architectures have. The M3 Ultra is 800GB/s, the RTX 5090 is 1.8TB/s, and the H200 is 4.8TB/s(!). Huge capacity with middling bandwidth is in vogue because it’s a good fit for AI inference, but AI training and most other GPU applications need as much bandwidth as they can get.
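To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes that generating each token means streaming all of the model's weights from memory once, so bandwidth divided by weight size gives a ceiling on tokens per second. The 70 GB model size is a made-up example, not a specific model, and the sketch ignores compute, batching, caching, and whether the weights even fit in each device's memory.

```python
# Peak memory bandwidth in GB/s, using the figures quoted above.
BANDWIDTH_GB_S = {
    "M3 Ultra": 800,
    "RTX 5090": 1800,
    "H200": 4800,
}

# Hypothetical model size: 70 GB of weights (an assumption for illustration).
WEIGHTS_GB = 70.0


def max_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Ceiling on decode speed if every token reads all weights exactly once."""
    return bandwidth_gb_s / weights_gb


for name, bandwidth in BANDWIDTH_GB_S.items():
    ceiling = max_tokens_per_sec(bandwidth, WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.1f} tokens/s")
```

Under these assumptions the ceiling scales linearly with bandwidth, so the H200 comes out about 6x higher than the M3 Ultra. Real throughput will be lower and depends on a lot more than this one ratio.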
For comparison, Nvidia’s RTX Pro 6000 has 96GB of VRAM and draws 600W of power.