Vsora Jotunn-8 5nm European inference chip

123 points by rdg42 12 hours ago

ano-ther 4 hours ago

I don’t get the negativity.

The specs look impressive. It is always good to have competition.

They announced tapeout in October with planned dev boards next year. Vaporware is when things don’t appear, not when they are on their way (it takes some time for hardware).

It’s also strategically important for Europe to have its own supply. The current and previous US administrations have both threatened to limit the supply of AI chips to European countries, and China would do the same (as they have shown with Nexperia).

And of course you need the software stack with it. They will have thought of that.

https://vsora.com/vsora-announces-tape-out-of-game-changing-...

impossiblefork 2 hours ago

It's not just competition.
These kinds of things (cheaper-than-NVIDIA cards that can produce a lot of tokens or run large models cheaply) are absolutely necessary to scale text models economically.
Without things like these (those Euclyd things, those Groq things, etc.), no one will be able to offer big models at prices where people will actually use them, so the lack of things like this actually cripples training of big models too.
If the price/token graph is right, this would mean 2.5x more tokens, which presumably means actually using multiple prompts to refine something before producing the output, or otherwise producing really long non-output sequences during the preparation of the output. This also fits really well with the Chinese progress in LLM RL for maths. I suspect all that stuff is totally general and can be applied to non-maths things too.
jack_tripper 2 hours ago

>I don’t get the negativity.
Where do you see the negativity?
I don't believe labeling healthy skepticism and criticism as negativity, to farm artificial sympathy in retaliation, does any good to anyone.
Humans have pattern recognition capabilities for a reason, and if a company is triggering that in them, then it's best to express why (probably because they have seen this MO before and got burned) instead of just cheerleading the unknown for fake positivity.
NaomiLehman 4 hours ago

I'm guessing the negativity is caused by bad branding

leo_e 7 hours ago

Impressive numbers on paper, but looking at their site, this feels dangerously close to vaporware.

The bottleneck for inference right now isn't just raw FLOPS or even memory bandwidth—it's the compiler stack. The graveyard of AI hardware startups is filled with chips that beat NVIDIA on specs but couldn't run a standard PyTorch graph without segfaulting or requiring six months of manual kernel tuning.

Until I see a dev board and a working graph compiler that accepts ONNX out of the box, this is just a very expensive CGI render.

mg 6 hours ago

Six months of one developer tuning the kernel?
That seems like not much compared to the hundreds of billions of dollars US companies currently invest into their AI stack? OpenAI pays thousands of engineers and researchers full time.
- SilverBirch 26 minutes ago
  
  It is. The problem is latency. All these fields are moving very fast, so spending 6 months tuning something doesn't sound bad, but in reality, during those 6 months the guy who built the thing you're tuning has iterated 5 more times. What you started on 6 months ago is now much, much better than what you got handed back then, whilst simultaneously being much worse than what that person has in their hands today. If the field you're working in is relatively static, or your performance gap is large enough, it makes sense. But in most fields the performance gap is large in absolute terms but small in temporal terms. You could make something run 10x faster, but you can't build something that will run faster than what will be state of the art in 2 months.
- NaomiLehman 4 hours ago
  
  more like 100 developers for 2 years
  
  nrhrjrjrjtntbt an hour ago
  
  it's the new "...and tell me if the picture has a bird"
vlovich123 2 hours ago

Inference accelerators are not where Nvidia is maintaining their dominance afaik.
IshKebab an hour ago

This 100x. I used to work for one of those startups. You need something crazy like a 10x performance advantage to get people to switch from Nvidia to some here-today-gone-tomorrow startup with a custom compiler framework that requires field engineer support to get anything to run.
The outcome is that most custom chips end up not being sold on the open market; instead their manufacturers run them themselves and sell LLM-as-a-service. E.g. Cerebras, SambaNova, and you could count Google's TPUs there too.
m00dy 5 hours ago

very good point leo_e
indeed, no mention of PyTorch on their website... honestly it looks a bit scammy as well

cardameu 4 hours ago

I can assure you it's not vaporware at all. Silicon is running in the fab, application boards have finished the design phase, software stack validated...

simondotau 3 hours ago

Having a new account promise us it's not vaporware is what I'd expect to see if it was vaporware.

pclmulqdq 12 hours ago

It needs a "buy a card" link and a lot more architectural details. Tenstorrent is selling chips that are pretty weak, but will beat these guys if they don't get serious about sharing.

Edit: It kind of looks like there's no silicon anywhere near production yet. Probably vaporware.

layer8 11 hours ago

Tapeout apparently completed last month, dev boards in early 2026: https://www.eetimes.eu/vsora-tapes-out-ai-inference-chip-for...
embedding-shape 11 hours ago

Nice wave they've been able to ride if it's vaporware, considering they've been at it for five years. Any guesses as to why no one else seemingly sees the obvious thing you see?
- pclmulqdq 11 hours ago
  
  Look at the CGI graphics and the indications in their published material that all they have is a simulation. It's all there, without disclosing an anticipated release date. Even their product pages and their news page don't seem to have indications of this.
  Also, the 3D graphic of their chip on a circuit board is missing some obvious support pieces, so it's clearly not from a CAD model.
  Lots of chip startups start as this kind of vaporware, but very few of them obfuscate their chip timelines and anticipated release dates this much. 5 years is a bit long to tape out, but not unreasonable.
  
  embedding-shape 10 hours ago
  
  > Even their product pages and their news page don't seem to have indications of this.
  This seems indicative enough for me, give or take a quarter or two probably, from the latest news post on their website:
  > VSORA is now preparing for full-scale deployment, with development boards, reference designs, and servers expected in early 2026.
  https://vsora.com/vsora-announces-tape-out-of-game-changing-...
  Seems they have partners too, who describe working together with a Taiwanese company as well.
  You never know, I guess they could have gotten others to fall for their illusions too, it's not unheard of. But considering how long something like this takes to bring to market, that they have dev-boards ready in months rather than years at least gives me enough reason to wait until then before judging them too harshly.
  
  lukan 9 hours ago
  
  "that they have dev-boards ready in months rather than years at least gives me enough reason to wait until then before judging them too harshly."
  So far, they just talk about it.

bangaladore 8 hours ago

I love that the JS loads so slowly on first load that it just says "The magic number: 0 /tflops"

unwind 3 hours ago

It loaded fine for me, but that slash before the unit was a bit smelly. :| Just a tiny edit, but it's a rather core part of their message so they should probably notice and format it correctly before publishing.
- SiempreViernes an hour ago
  
  I think it could be intended. There is an SI document that says something like "x /unit" is a common way to indicate the unit of a quantity, which a guy I know is using as the basis for advocating for that ugly display standard.

randomgermanguy an hour ago

The fact that I have to give them an email for details just immediately feels like a B2B scam.

Hope they can figure out the software, but what I'm seeing isn't super promising

Ethan312 7 hours ago

Always good to see more competition in the inference chip space, especially from Europe. The specs look solid, but the real test will be how mature the software stack is and whether teams can get models running without a lot of friction. If they can make that part smooth, it could become a practical option for workloads that want local control.

all2 12 hours ago

288GB RAM on board, and RISC-V processors to enable the option of offloading inference from the host machine entirely.

It sounds nice, but how much is it?

rq1 11 hours ago

The next generation will include another processor to offload the inference from the RISC-V processors used to offload inference from the host machine.
- ddalex 3 hours ago
  
  The next next generation will include memory to offload memory from the on-chip memory to the memory on memory (also known as SRAM cache)

qwertox 2 hours ago

One has got to love the fact that you only get more information if you submit your email address.

disdi 4 hours ago

Esperanto tried to do the same but went out of business. https://www.esperanto.ai/products/

N_Lens 10 hours ago

An FP8 performance of 3200 TFLOPS is impressive, and could be used for training as well as inference. "Close to theory efficiency" is a bold statement. Most accelerators achieve 60-80% of theoretical peak; if they're genuinely hitting 90%+, that's impressive. Now let's see the price.
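
To put rough numbers on that, here is a minimal sketch of what those utilization ranges would mean in sustained throughput. The 3200 TFLOPS peak is the figure quoted above; the function name `effective_tflops` and the chosen utilization fractions are purely illustrative assumptions, not vendor data or measurements.

```python
# Illustrative arithmetic only: 3200 TFLOPS is the FP8 peak quoted in the
# comment above; the utilization fractions are rough ranges from the same
# comment, not measured results for any real chip.

PEAK_FP8_TFLOPS = 3200.0


def effective_tflops(peak_tflops: float, utilization: float) -> float:
    """Return sustained throughput for a given fraction of theoretical peak."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return peak_tflops * utilization


if __name__ == "__main__":
    # Compare the "typical" 60-80% range with the claimed 90%+.
    for utilization in (0.60, 0.80, 0.90):
        sustained = effective_tflops(PEAK_FP8_TFLOPS, utilization)
        print(f"{utilization:.0%} of peak -> {sustained:.0f} TFLOPS sustained")
```

The gap between the 60% and 90% cases is a large share of the headline number, which is why the efficiency claim matters about as much as the peak figure itself.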

ndom91 3 hours ago

I'll believe it when I see it. Wishing them the best!

> To streamline development and shorten time-to-market, VSORA embraces industry standards: our toolchain is built on LLVM and supports common frameworks like ONNX and PyTorch, minimizing integration effort and customer cost.

numbers_guy 2 hours ago

Does anyone know why they brand it an "inference chip"? Is it something at the hardware level that makes it unsuitable for training, or is it simply that the toolchain for training is massively more complicated to program?

yaantc 2 hours ago

Very simplified: AI workloads need compute and communications. Compute dominates inference, while communications dominate training.
Most start-ups innovate on the compute side, whereas the technology needed for state-of-the-art communications is not common, and very low-level: plenty of analog concerns. The domain is dominated by NVidia and Broadcom today.
This is why digital start-ups tend to focus on inference. They innovate on the pure digital part, which is compute, and tend to use off-the-shelf IPs for communications, so not a differentiator and likely below the leaders.
But in most cases coupling a computation engine marketed for inference with state of the art communications would (in theory) open the way for training too. It's just that doing both together is a very high barrier. It's more practical to start with compute, and if successful there use this to improve the comms part in a second stage. All the more because everyone expects inference to be the biggest market too. So AI start-ups focus on inference first.
IshKebab an hour ago

Probably because their software only supports inference. It's relatively easy to do via ONNX. Training requires an order of magnitude more software work.

postexitus 3 hours ago

Even if it's not vapourware, the website makes it look like it is. Just look at those two graphs titled "Jotunn 8 Outperforms the Market" and "More Speed For the Bucks" (!); WTH?

thevania 4 hours ago

reminds me of the famous Tachyum Prodigy vapourware https://www.tachyum.com/
