Top 5 AI Inference Platforms That Use Full-Blooded DeepSeek-R1 for Free

AI News · Sharenet.ai · 4 months ago

Due to excessive traffic and a cyberattack, the DeepSeek website and app have been intermittently unavailable for several days, and the API has been unreliable.


We have previously shared how to deploy DeepSeek-R1 locally (see DeepSeek-R1 Local Deployment), but most users' hardware struggles to run even the 70B model, let alone the full 671B one.

Luckily, all the major platforms now host DeepSeek-R1, so you can use them as a drop-in replacement.

 

I. NVIDIA NIM Microservices

NVIDIA Build: many AI models integrated and free to try
Website: https://build.nvidia.com/deepseek-ai/deepseek-r1

NVIDIA has deployed the full 671B-parameter DeepSeek-R1 model. The web version is straightforward: click through and the chat window appears:


A code panel is also shown on the right:


A quick test:


Below the chat box you can also adjust a few parameters (the defaults are fine in most cases):


The rough meaning and effect of each option:

Temperature:
The higher the value, the more random the output, potentially yielding more creative responses.

Top P (nucleus sampling):
Higher values keep more of the token probability mass, producing more diverse output.

Frequency Penalty:
Higher values penalize frequently used tokens more strongly, reducing verbosity and repetition.

Presence Penalty:
Higher values push the model toward introducing tokens it has not used yet.

Max Tokens (maximum generation length):
Higher values allow longer responses.

Stop:
Halts generation when a given character or sequence appears, preventing overly long or off-topic output.
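
If you call the model through an OpenAI-compatible API instead of the web UI, these knobs map directly onto fields of the request body. A minimal sketch; the values and the stop sequence are illustrative, not recommendations:

```python
# How the sampling knobs above map onto an OpenAI-compatible
# chat-completions request body (values are illustrative).
payload = {
    "model": "deepseek-ai/deepseek-r1",
    "messages": [
        {"role": "user", "content": "Explain nucleus sampling in one sentence."}
    ],
    "temperature": 0.6,        # higher = more random, more creative
    "top_p": 0.7,              # keep tokens covering 70% of probability mass
    "frequency_penalty": 0.5,  # discourage repeating frequent tokens
    "presence_penalty": 0.5,   # encourage introducing new tokens
    "max_tokens": 4096,        # upper bound on response length
    "stop": ["<|end|>"],       # hypothetical stop sequence
}
```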

Currently, with so many freeloaders piling in (note the queue count in the screenshot below), NIM lags at times:


Is NVIDIA also short of graphics cards?

NIM microservices also supports calling DeepSeek-R1 via API, but you need to sign up with an email address:


The registration process is simple; only email verification is required:


After registering, click "Build with this NIM" at the top right of the chat interface to generate an API key. Registration currently grants 1,000 credits (1,000 calls); once they run out, you can register again with a new email address.
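
With a key in hand, the endpoint can be called like any OpenAI-compatible API. Here is a minimal sketch using only the standard library; the base URL and model ID follow NVIDIA's published examples, but double-check them against the code panel on the model page:

```python
import json
import os
import urllib.request

API_KEY = os.environ.get("NVIDIA_API_KEY", "")  # key from "Build with this NIM"
URL = "https://integrate.api.nvidia.com/v1/chat/completions"

def build_request(prompt: str) -> urllib.request.Request:
    """Assemble an OpenAI-style chat-completions request for NIM."""
    body = json.dumps({
        "model": "deepseek-ai/deepseek-r1",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }).encode()
    return urllib.request.Request(URL, data=body, headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    })

req = build_request("Say hello in one word.")
if API_KEY:  # only hit the network when a key is configured
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```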


The NIM microservices platform also provides access to many other models:


 

II. Microsoft Azure

Website: https://ai.azure.com

Microsoft Azure allows you to create a chatbot and interact with the model through a chat playground.


Signing up for Azure is a lot more trouble. First you have to create a Microsoft account (just log in if you already have one):


Creating an account also requires email verification:


Finish by proving you're human, which means answering ten bizarre puzzle questions in a row:


Even that isn't enough; you still need to create a subscription:


This requires verifying details such as your phone number and bank card:


Next, select "No technical support":


Now the cloud deployment can begin. In the "Model Catalog", DeepSeek-R1 is featured prominently:


Click it, then click "Deploy" on the next page:


Next, you need to select "Create New Project":


Then leave everything at the defaults and click "Next":


Next, click "Create":


Creation then starts on this page and takes a while:


When it finishes, you land on this page; click "Deploy" to continue:


You can also check "Pricing and Terms" above to confirm it is free to use:


Click "Deployment" to reach this page, then click "Open in Playground":


Then the conversation can begin:


Azure also has NIM-like parameter tuning available:


As a platform, Azure offers many other deployable models:


Deployed models can later be reached quickly via "Playground" or "Models + Endpoints" in the left menu:

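
Beyond the Playground, the deployment exposes an HTTPS endpoint you can call directly. The endpoint URL and key below are placeholders; copy the real values from your deployment's "Models + Endpoints" page:

```python
import json
import os
import urllib.request

# Placeholders: take the real endpoint URL and key from your
# Azure deployment's "Models + Endpoints" page.
ENDPOINT = os.environ.get(
    "AZURE_ENDPOINT",
    "https://YOUR-DEPLOYMENT.models.ai.azure.com/chat/completions")
KEY = os.environ.get("AZURE_KEY", "")

def build_request(prompt: str) -> urllib.request.Request:
    """OpenAI-style chat request against the deployed Azure endpoint."""
    body = json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 2048,
    }).encode()
    return urllib.request.Request(ENDPOINT, data=body, headers={
        "api-key": KEY,  # key-based auth header used by Azure AI endpoints
        "Content-Type": "application/json",
    })

req = build_request("Hello!")
if KEY:  # only call out when a real key is configured
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```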

III. Amazon AWS

Website: https://aws.amazon.com/cn/blogs/aws/deepseek-r1-models-now-available-on-aws

DeepSeek-R1 is prominently featured there as well.


AWS registration is almost as troublesome as Azure's: you must enter a payment method and pass phone plus voice verification, so we won't detail it here:


The deployment process itself is much the same as on Azure:


IV. Cerebras

Cerebras: high-performance computing platform claiming the world's fastest AI inference
Website: https://cerebras.ai

Unlike the larger platforms above, Cerebras hosts the distilled 70B model, claiming it runs "57 times faster than GPU solutions."


After registering with an email and logging in, pick DeepSeek-R1 from the drop-down menu at the top:


Real-world speed is indeed fast, though not as dramatic as claimed:

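
Speed claims like "57 times faster" boil down to throughput in tokens per second, which you can measure yourself by timing a request and reading the completion token count from the response's usage field. A trivial sketch of the arithmetic:

```python
def tokens_per_second(completion_tokens: int, elapsed_s: float) -> float:
    """Throughput metric behind claims like '57x faster than GPU solutions'."""
    return completion_tokens / elapsed_s

# e.g. a response reporting 1200 completion tokens after 1.5 s of wall time:
print(tokens_per_second(1200, 1.5))  # -> 800.0
```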

V. Groq

Groq: AI inference acceleration provider with a fast, free API for large models
Website: https://groq.com/groqcloud-makes-deepseek-r1-distill-llama-70b-available


After registering with an email and logging in, the model can likewise be selected:


It's also fast, but the 70B model feels somewhat less capable here than on Cerebras.


Note that while logged in, the chat interface can be reached directly at:
https://console.groq.com/playground?model=deepseek-r1-distill-llama-70b
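
One practical note for all of these platforms: DeepSeek-R1 and its distills emit their chain of thought inside <think>…</think> tags before the final answer, so when calling any of these APIs you may want to strip that block. A small sketch:

```python
import re

def strip_reasoning(text: str) -> str:
    """Remove the <think>...</think> chain-of-thought block that
    R1-style models prepend to their final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

raw = "<think>The user greets me; reply briefly.</think>Hello! How can I help?"
print(strip_reasoning(raw))  # -> Hello! How can I help?
```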

 

A complete list of platforms serving DeepSeek V3 and R1:

AMD

AMD Instinct™ GPUs Power DeepSeek-V3: Revolutionizing AI Development with SGLang

NVIDIA

DeepSeek-R1 NVIDIA model card

Microsoft Azure

Running DeepSeek-R1 on a single NDv5 MI300X VM

Baseten

https://www.baseten.co/library/deepseek-v3/

Novita AI

Novita AI uses SGLang to run DeepSeek-V3 for OpenRouter

ByteDance Volcengine

The full-size DeepSeek model lands on Volcengine!

DataCrunch

Deploy DeepSeek-R1 671B on 8x NVIDIA H200 with SGLang

Hyperbolic

https://x.com/zjasper666/status/1872657228676895185

Vultr

How to Deploy DeepSeek V3 Large Language Model (LLM) Using SGLang

RunPod

What's New for Serverless LLM Usage in RunPod in 2025?
