Today we are going to attempt a challenging task: deploying the official 671B-parameter DeepSeek R1 model on-premises or in the cloud. It is arguably one of the most powerful open-source AI models available today. Although my hardware falls far short of the requirements, it is still worth finding out whether a modest machine can pull off this feat.
First, an important clarification: many people conclude that a local DeepSeek R1 deployment is inferior to the cloud service, but that is usually because they are not running the strongest version of the open-source model. The common 7B, 8B, 14B, and even 70B variants are dense smaller models distilled from DeepSeek R1 onto Qwen and Llama bases; their performance is not on the same level as the true 671B original.
Hardware requirements:
CPU: 32-core Intel Xeon or AMD EPYC recommended.
Memory: at least 512GB of RAM.
GPU: at least 4x NVIDIA A100 with 80GB of video memory each.
Storage: at least 2TB.
Network: 10Gbps bandwidth.
My machine runs Windows 11 Pro with 64GB of memory, so the first attempt was to use virtual memory to make up for the shortfall.
The steps are as follows:
Install Ollama – a free, open-source tool for running models locally. Download and install the Windows version of Ollama.
Download the model – use Ollama to pull the DeepSeek R1 671B model; the download ran for about an hour. A VPN may be needed to keep the download speed up.
Adjust virtual memory – since physical memory was insufficient, I tried to work around the limit by enlarging the page file: in the system settings I set the initial size to 350GB and the maximum to 450GB, then restarted the computer.
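On Windows, Ollama ships as a GUI installer, but the same install-and-pull steps can be sketched as shell commands for a Linux/macOS box. The model tag `deepseek-r1:671b` is the name Ollama publishes for the full model; the script only prints the checklist rather than executing it, since the pull alone fetches hundreds of GB of weights:

```shell
#!/bin/sh
# Local-deployment steps collected as a checklist and printed, not executed,
# because the model pull alone downloads roughly 400 GB of weights.
STEPS='
# 1. Install Ollama with the official install script (Linux/macOS).
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull the full 671B model (hundreds of GB on disk).
ollama pull deepseek-r1:671b

# 3. Chat with it once the weights are downloaded.
ollama run deepseek-r1:671b
'
printf '%s\n' "$STEPS"
```

Virtual memory itself has no command-line step here; on Windows it is adjusted through the system settings as described above.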
After these adjustments the model did download, but memory was still insufficient at runtime: the error showed I was still more than 400GB short.
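The shortfall is easy to sanity-check with back-of-envelope arithmetic: at 671 billion parameters, even aggressive 4-bit quantization leaves the weights alone at several hundred GB, before any KV cache or runtime overhead. A quick sketch:

```shell
# Approximate weight sizes for a 671B-parameter model at common precisions
# (weights only; KV cache and runtime overhead come on top).
awk 'BEGIN {
  p = 671e9                                    # parameter count
  printf "FP16 (2 bytes/param):  ~%d GB\n", p * 2.0 / 1e9
  printf "INT8 (1 byte/param):   ~%d GB\n", p * 1.0 / 1e9
  printf "Q4   (0.5 bytes/param): ~%d GB\n", p * 0.5 / 1e9
}'
```

Even the 4-bit figure (~335 GB of weights) is roughly consistent with the ~400GB the model occupies on disk once format overhead is included, and far beyond 64GB of physical RAM.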
Next, we try deploying in the cloud using a professional GPU rental service. Many NVIDIA H100 and A100 configurations were already sold out due to high demand; in the end I chose an 8x H100 instance in Seattle. Its disk space is a bit tight, but the memory and video memory are enough to run the model.
Cloud deployment steps:
Install Ollama.
Download the DeepSeek R1 671B model.
Use Docker to install Open WebUI as a front end for calling the model.
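The three steps above, sketched as shell commands on the cloud instance. The `docker run` line follows Open WebUI's published quick-start; port 3000 and the volume name are arbitrary choices, and `--add-host` lets the container reach the Ollama server running on the host. As before, the script prints the checklist rather than executing it:

```shell
#!/bin/sh
# Cloud-instance setup as a printed checklist: Ollama plus Open WebUI in Docker.
CMDS='
# 1. Install Ollama and pull the model, as in the local attempt.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull deepseek-r1:671b

# 2. Start Open WebUI; --add-host lets the container reach Ollama on the host.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
'
printf '%s\n' "$CMDS"
```

Once the container is up, the UI is reachable on port 3000 of the instance and the pulled model appears in its model list.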
In the cloud the model runs smoothly, but exposing the API requires real-name verification, which will put many people off.
Through this attempt we successfully deployed the DeepSeek R1 671B model both on-premises and in the cloud. Even though the local run was extremely slow, it proves that experimentation is still possible when the hardware falls short, just with greatly reduced efficiency and performance.
For anyone who actually needs high performance, cloud deployment is the more realistic option, albeit a more costly one.