Today we are going to attempt a challenging task: deploying the official 671B-parameter DeepSeek R1 model on-premises or in the cloud. It is arguably one of the most powerful open-source AI models available today. Although my hardware falls far short of the requirements, it is still worth finding out whether a modest machine can pull off this feat.
First, an important clarification: many people conclude that a local DeepSeek R1 deployment is inferior to the cloud service, but that is usually because they are not running the strongest version of the open-source model. The common 7B, 8B, 14B, and even 70B variants are dense smaller models distilled from DeepSeek R1 onto Qwen and Llama bases; their performance is not on the same level as the true 671B original.
Hardware requirements:
CPU: 32-core Intel Xeon or AMD EPYC recommended.
Memory: at least 512GB of RAM.
GPU: at least 4x NVIDIA A100 with 80GB of video memory each.
Storage: at least 2TB.
Network: 10Gbps bandwidth.
My machine runs Windows 11 Pro with 64GB of memory, so the first attempt was to use virtual memory to make up for the shortfall.
The steps are as follows:
Install Ollama – a free, open-source tool for running models locally. Download and install the Windows version of Ollama.
Download the model – use Ollama to pull the DeepSeek R1 671B model; the download ran for about an hour. A VPN may be needed to keep the download speed up.
Adjust virtual memory – since physical memory was insufficient, I tried to work around the limit by enlarging the page file: in the system settings I set the initial size to 350GB and the maximum to 450GB, then restarted the computer.
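On Windows, Ollama ships as a GUI installer, but the same install-and-pull steps can be sketched as shell commands for a Linux/macOS box. The model tag `deepseek-r1:671b` is the name Ollama publishes for the full model; the script only prints the checklist rather than executing it, since the pull alone fetches hundreds of GB of weights:

```shell
#!/bin/sh
# Local-deployment steps collected as a checklist and printed, not executed,
# because the model pull alone downloads roughly 400 GB of weights.
STEPS='
# 1. Install Ollama with the official install script (Linux/macOS).
curl -fsSL https://ollama.com/install.sh | sh

# 2. Pull the full 671B model (hundreds of GB on disk).
ollama pull deepseek-r1:671b

# 3. Chat with it once the weights are downloaded.
ollama run deepseek-r1:671b
'
printf '%s\n' "$STEPS"
```

Virtual memory itself has no command-line step here; on Windows it is adjusted through the system settings as described above.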
After these adjustments the model did download, but memory was still insufficient at runtime: the error showed I was still more than 400GB short.
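The shortfall is easy to sanity-check with back-of-envelope arithmetic: at 671 billion parameters, even aggressive 4-bit quantization leaves the weights alone at several hundred GB, before any KV cache or runtime overhead. A quick sketch:

```shell
# Approximate weight sizes for a 671B-parameter model at common precisions
# (weights only; KV cache and runtime overhead come on top).
awk 'BEGIN {
  p = 671e9                                    # parameter count
  printf "FP16 (2 bytes/param):  ~%d GB\n", p * 2.0 / 1e9
  printf "INT8 (1 byte/param):   ~%d GB\n", p * 1.0 / 1e9
  printf "Q4   (0.5 bytes/param): ~%d GB\n", p * 0.5 / 1e9
}'
```

Even the 4-bit figure (~335 GB of weights) is roughly consistent with the ~400GB the model occupies on disk once format overhead is included, and far beyond 64GB of physical RAM.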
Next, we try deploying in the cloud using a professional GPU rental service. Many NVIDIA H100 and A100 configurations were already sold out due to high demand; in the end I chose an 8x H100 instance in Seattle. Its disk space is a bit tight, but the memory and video memory are enough to run the model.
Cloud deployment steps:
Install Ollama.
Download the DeepSeek R1 671B model.
Use Docker to install Open WebUI as a front end for calling the model.
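The three steps above, sketched as shell commands on the cloud instance. The `docker run` line follows Open WebUI's published quick-start; port 3000 and the volume name are arbitrary choices, and `--add-host` lets the container reach the Ollama server running on the host. As before, the script prints the checklist rather than executing it:

```shell
#!/bin/sh
# Cloud-instance setup as a printed checklist: Ollama plus Open WebUI in Docker.
CMDS='
# 1. Install Ollama and pull the model, as in the local attempt.
curl -fsSL https://ollama.com/install.sh | sh
ollama pull deepseek-r1:671b

# 2. Start Open WebUI; --add-host lets the container reach Ollama on the host.
docker run -d -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui --restart always \
  ghcr.io/open-webui/open-webui:main
'
printf '%s\n' "$CMDS"
```

Once the container is up, the UI is reachable on port 3000 of the instance and the pulled model appears in its model list.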
In the cloud the model runs smoothly, but exposing the API requires real-name verification, which will put many people off.
Through this attempt we successfully deployed the DeepSeek R1 671B model both on-premises and in the cloud. Even though the local run was extremely slow, it proves that experimentation is still possible when the hardware falls short, just with greatly reduced efficiency and performance.
For anyone who actually needs high performance, cloud deployment is the more realistic option, albeit a more costly one.