Prometheus unbound: The server that can break the memory wall

Majestic Labs founders Ofer Shacham, Sha Rabii and Masumi Reynders credit: PR

Majestic Labs CEO Ofer Shacham tells “Globes” that the company’s newly unveiled AI server Prometheus is 50 times more powerful than rival options.

Majestic Labs has emerged as one of the most intriguing and mysterious tech startups operating in Israel. The company was founded by a senior team of former engineers from Google and Meta, and raised $100 million in November 2025, promising to develop an AI server that beats Nvidia on the efficiency of every AI processing operation (price per token).

Majestic Labs cofounder and CEO Ofer Shacham, who founded Meta's chip lab at the request of Mark Zuckerberg, has now revealed how the company will compete with Nvidia. In his first interview in the Israeli media, Shacham explains what is behind the server that wants to solve one of the biggest bottlenecks in the AI servers of companies like Nvidia and AMD - a severe lack of memory.

Ofer Shacham and his US cofounders, Sha Rabii and Masumi Reynders, did not settle for developing a new graphics processor. Instead, they engineered a completely new server called Prometheus, on the assumption that relieving the bottleneck in AI processing begins there.

Each of the Israeli company's servers is equipped with memory chips providing roughly 100 times the memory of a standard Nvidia server built around Blackwell (B200) processors. Thanks to a structure that gives each processor extended access to memory, Majestic Labs' processors can address 128 terabytes of RAM. The server's architecture - the way the main memory components are wired to the various chips - in effect provides a thousand times the memory available to a single Nvidia Blackwell processor, which carries an estimated 192 gigabytes.
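As a rough sanity check, the two ratios quoted above can be reproduced with simple arithmetic. This is a back-of-the-envelope sketch: the assumption that a "standard Nvidia server" holds 8 Blackwell GPUs is ours, not from the article.

```python
# Back-of-the-envelope check of the memory ratios quoted in the article.
# Assumption (not from the article): a standard Nvidia server holds
# 8 Blackwell (B200) GPUs with 192 GB of HBM each.
TB = 1024  # gigabytes per terabyte (binary convention)

majestic_server_gb = 128 * TB            # 128 TB of RAM per Prometheus server
blackwell_gpu_gb = 192                   # estimated HBM on a single B200
nvidia_server_gb = 8 * blackwell_gpu_gb  # assumed 8-GPU Nvidia server

vs_server = majestic_server_gb / nvidia_server_gb  # ≈ 85x, near the "~100x" claim
vs_gpu = majestic_server_gb / blackwell_gpu_gb     # ≈ 683x, near the "~1,000x" claim
print(round(vs_server), round(vs_gpu))
```

Both quoted multiples are thus order-of-magnitude figures rather than exact ratios.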

Shacham does not disclose what type of memory is involved, but confirms that it is not the high-bandwidth memory (HBM) commonly used in Nvidia processors. According to estimates, the company purchases the memory components from the three major memory chip makers: Micron, Samsung and SK Hynix. Instead of Nvidia's GPUs, Majestic Labs offers its own processors, called Ignite, or AIU for short. These were developed from ARM intellectual property, original Majestic design work, and the open source RISC-V platform, allowing AI computation to be tailored to the requirements of different companies.

Since Nvidia's dominance rests not only on its supply of graphics processors but also on its control of CUDA, its software platform for AI, the AIU processors were built to let AI programmers develop applications in frameworks such as PyTorch, the development language accepted by AI experts in the Nvidia environment, and on OpenAI's Triton, the open source compiler that has become CUDA's major competitor - even though Nvidia has recently become one of OpenAI's biggest investors.

The architectural problem and the difference from Nvidia

According to Shacham, Majestic Labs' new architecture eliminates the need for much of the networking equipment - a role played in Nvidia servers by communication processors from what was formerly Mellanox.

Shacham says, "We do not need communication between the processors, because the communication is done through memory - just as it happens in standard computing, like communication between cores in a regular multi-core processor. The need for so many communication chips in data centers arose because of the small amount of memory attached to each Nvidia processor, which compels the processors to communicate with each other.

"That's why, in order to provide a single coherent server, Nvidia had to build an expensive hybrid machine - a rack of 72 processors linked to each other by high-bandwidth communication, Nvidia's NVLink interconnect. But because of the inefficiency of their operation, the processors very quickly hit the ceiling of the memory available to them and stand idle, waiting for data to reach them. As the number of parameters in the frontier models released by companies like Anthropic, OpenAI and Google keeps growing, companies are forced to purchase more and more servers to supply this enormous processing power.

"Frontier models, such as Gemini and GPT, struggle to run even within the memory of 10 graphics processors, so they require entire server racks of 72 Nvidia processors - the configuration regarded as able to handle the largest models.

"Yet Jensen Huang himself recently showed a presentation indicating that even in such a rack the models are already struggling: already at 400,000 tokens (the token being the basic unit of AI processing) performance declines - never mind the 5-trillion-parameter models due to be launched toward the end of this year or early next year.

"The need to pack in so many GPUs to handle these models dictates energy consumption that grows exponentially and is unsustainable. Then you hit the memory ceiling - what the industry calls the 'memory wall' - and the processors sit idle half the time, burning energy while waiting for data to flow to them. The result is diminishing returns and rising energy consumption for each additional GPU chip. It's not just a lack of memory - it's an architectural problem in the way AI computing works today."

This problem drives enormous spending by the five biggest cloud giants - Amazon, Google, Microsoft, Meta and Tesla - amounting to $443 billion last year and an expected $602 billion this year.

Does this mean you will market servers that are cheaper than Nvidia's?

"We don't compete on price per processing unit, but on price per result - the cost per token. We offer a machine that is capable of producing between 10 and 50 times more tokens per megawatt for every dollar invested in building a data center.

"I have a client who is currently building a data center with an electrical capacity of 500 megawatts. What he asks me is not necessarily how much the data center will cost him, but how many tokens he will be able to sell per megawatt, and I can give him up to 50 times what is standard in the market. Offering a cheaper product is not necessarily a sustainable model for us - that's how you get a 'race to the bottom'. I don't have a cost advantage over Nvidia, because there is also a game of volumes here: they can supply large quantities at low cost and offer discounts. We want to sell our product at a good profit."

The finished product will reach customers next year

Shacham says Majestic's servers and chips are built primarily for inference and for running AI agents rather than for model training, although they could be adapted to training as well. The company focuses on language models and on graph- and table-based neural networks, and less on image and video models.

When will you start selling the servers and processors?

"We are already working with several customers in the prototype phase, but the finished product will be shipped to our first customers next year. We are already taking orders and working with several customers to better adapt our product to their needs."

A company that has proven to customers - likely cloud giants - that it can make AI model processing 50 times more efficient sounds like a very attractive acquisition target for a company like Nvidia or one of the cloud giants. Have you received any acquisition offers?

"I was asked recently: are we building a product or is the company the product? The answer is that we are building a product. I've built products like that for Google, for Meta, the first processors for the US Defense Advanced Research Projects Agency (DARPA). We're here to build a product that can solve problems for entire industries.

"Of course, things can happen along the way, but our aim is to build a product that our customers - companies that build data centers for AI processing - enjoy, one that improves their energy consumption and, as a result, saves a great deal of the energy costs they have planned for the long term. We had to leave our previous jobs at the tech giants to understand what the bottlenecks were, and we got there much faster than we thought. When we founded the company two and a half years ago, we said the memory problem would be the industry's biggest headache; today it is ten times worse than we expected it to be at this stage."

Published by Globes, Israel business news - en.globes.co.il - on April 29, 2026.

© Copyright of Globes Publisher Itonut (1983) Ltd., 2026.
