How to Use DeepSeek R1 for Free in Visual Studio Code with Cline or Roo Code
Douglas Toledo (@dwtoledo) · Published Jan 24, 2025
If you're looking for an AI model that excels at reasoning and is free because it's open source, the newly launched DeepSeek R1 is a great choice. On several reasoning benchmarks it competes with, and sometimes outperforms, models like GPT-4, o1-mini, and Claude 3.5. I tested it and have nothing but praise!

If you want to run it directly in your Visual Studio Code as a code agent similar to GitHub Copilot, without spending a dime, come along as I show you how to do this using tools like LM Studio, Ollama, and Jan.


Why is DeepSeek R1 so talked about these days?

  • It's free and open source: Unlike many models that charge a fortune, you can use it without paying anything. It's even available for chat at https://chat.deepseek.com.

  • Performance: It matches or beats comparable models on tasks involving logic, mathematics, and even code generation (which is my favorite part).

  • Multiple versions: For local use there are distilled models ranging from 1.5B to 70B parameters, so you can pick whichever best matches your hardware.

  • Easy to integrate: You can connect it to VSCode using extensions like Cline or Roo Code.

  • No costs: If you run it locally, you don't pay for tokens or APIs. A graphics card is recommended, as running it solely on the CPU is slower.


Important Tips Before You Start

  • Save resources: If your PC isn't very powerful, stick with the smaller models (1.5B or 7B parameters) or quantized versions.

  • RAM Calculator: Use LLM Calc to find out the minimum RAM you'll need.

  • Privacy: Running it locally means your data stays on your PC and doesn't go to external servers.

  • No costs: Running it locally is free, but if you want to use the DeepSeek API, you'll need to pay for tokens. The good news is that their price is much lower than competitors.


Which Model to Choose? It Depends on Your PC!

DeepSeek R1 has several versions, and the choice depends on your hardware:

  • 1.5B parameters:
    • RAM required: ~4 GB.
    • GPU: Integrated (like an NVIDIA GTX 1050) or a modern CPU.
    • Best for: Simple tasks and modest PCs.

  • 7B parameters:
    • RAM required: ~8-10 GB.
    • GPU: Dedicated (like an NVIDIA GTX 1660 or better).
    • Best for: Intermediate tasks and PCs with better hardware.

  • 70B parameters:
    • RAM required: ~40 GB.
    • GPU: High-end (like an NVIDIA RTX 3090 or higher).
    • Best for: Complex tasks and very powerful PCs.
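
If you want a back-of-envelope check on these numbers, a rough rule of thumb (my own assumption, not an official figure) is about half a byte per parameter for 4-bit quantized weights, plus roughly 30% overhead for context buffers:

```bash
# Rough RAM estimate for a 7B model at 4-bit quantization (rule of thumb, not official):
# weights ≈ parameters (in billions) × 0.5 GB, plus ~30% overhead for context/KV cache.
awk 'BEGIN { printf "~%.1f GB\n", 7 * 0.5 * 1.3 }'   # prints ~4.6 GB
```

The recommendations above are more generous because you also need headroom for the OS, VSCode, and the inference runtime itself.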

How to Run DeepSeek R1 Locally


1. Using LM Studio

  • Download and install LM Studio: Just go to the LM Studio website and download the version for your system.

  • Download the DeepSeek R1 model: In LM Studio, go to the Discover tab, search for "DeepSeek R1," and select the version most compatible with your system. If you're using a MacBook with Apple processors, keep the MLX option selected next to the search bar (these versions are optimized for Apple hardware). For Windows or Linux, choose the GGUF option.

  • Load the model: After downloading, go to Local Models, select DeepSeek R1, and click Load.

  • Start the local server: In the Developer tab, enable Start Server. It will start running the model at http://localhost:1234 (see the quick check after this list).

  • Proceed to step 4: Integrating with VSCode!
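
Before wiring it up to VSCode, you can confirm the server is responding with a quick curl. This is a minimal sketch assuming LM Studio's OpenAI-compatible endpoint on the default port; the model ID below is a placeholder, so copy the real one from LM Studio:

```bash
# Quick check against LM Studio's OpenAI-compatible server (default port 1234).
# "deepseek-r1-distill-qwen-7b" is a placeholder model ID -- use the ID LM Studio shows.
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "deepseek-r1-distill-qwen-7b",
        "messages": [{"role": "user", "content": "Say hello in one word."}]
      }'
```

If you get a JSON response back, the server side is ready.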


2. Using Ollama

  • Install Ollama: Download it from the Ollama website and install it.
  • Download the model: In the terminal, run*:
```bash
ollama pull deepseek-r1
```

*This is the main model; if you want smaller models, go to https://ollama.com/library/deepseek-r1 and see which command to run in the terminal (a couple of examples follow below).
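
For example, at the time of writing the library page lists size-specific tags like the ones below (double-check the page, since tags can change):

```bash
# Smaller distilled variants, pulled by tag:
ollama pull deepseek-r1:1.5b
ollama pull deepseek-r1:7b
```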

  • Start the server: In the terminal, execute:
```bash
ollama serve
```

The command will start the server at http://localhost:11434 (see the quick check after this list).

  • Proceed to step 4: Integrating with VSCode!
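
As with LM Studio, a quick curl confirms the server is up before you touch VSCode. A minimal sketch using Ollama's generate endpoint, assuming the default model tag from the pull command above:

```bash
# Quick check against the Ollama server (default port 11434).
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1",
  "prompt": "Say hello in one word.",
  "stream": false
}'
```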

3. Using Jan

  • Download and install Jan: Choose the version for your system on the Jan website.

  • Download the model: I couldn't find DeepSeek R1 directly in Jan, so I went to the Hugging Face website and manually searched for "unsloth gguf deepseek r1." I found the version I wanted, clicked the "Use this model" button, and selected Jan as the option. The model opened automatically in Jan, and I then downloaded it.

  • Load the model: After downloading, select the model and click Load.

  • Start the server: Jan automatically starts the server, usually at http://localhost:1337.

  • Proceed to step 4: Integrating with VSCode!
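
Jan's server speaks the same OpenAI-compatible format as LM Studio's, so if you want to verify it before moving on, the curl check from the LM Studio section should work here too; just swap the port to 1337 and use the model ID Jan shows for your download.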


4. Integrating with VSCode

  • Install the extension: In VSCode, open the Extensions tab and install Cline or Roo Code.

  • Configure the extension for Jan or LM Studio: The configuration for both Cline and Roo Code is practically identical. Follow the steps below:

    • Click on the extension and access "Settings".
    • In API Provider, select "LM Studio".
    • In the Base URL field, enter the URL configured in Jan or LM Studio.
    • The Model ID field will be automatically filled if you only have one model available. Otherwise, manually select the DeepSeek model you downloaded.
    • Finish by clicking "Done".

  • Configure the extension for Ollama:

    • Click on the extension and access "Settings".
    • In API Provider, select "Ollama".
    • In the Base URL field, enter the URL configured in Ollama.
    • The Model ID field will be automatically filled if you only have one model available. Otherwise, manually select the DeepSeek model you downloaded.
    • Finish by clicking "Done".
  • Integration complete; now just enjoy the features of Cline or Roo Code. (A quick recap of the default endpoints follows below.)
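
For quick reference, these are the default local endpoints from the sections above (adjust them if you changed the port in the app):

  • LM Studio: http://localhost:1234
  • Ollama: http://localhost:11434
  • Jan: http://localhost:1337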


Conclusion

DeepSeek R1 is a lifesaver for those who want a powerful AI without spending anything. With LM Studio, Ollama, or Jan, you can run it locally and integrate it directly into Visual Studio Code. Choose the model that fits your PC and start using it today!

Comments

  • reddy developer · Jan 24, 2025

    The article is really nice. Would you advise the same for NVIDIA Nano products as well?

    • Douglas Toledo · Jan 25, 2025

      I haven't tested any NVIDIA AI products yet; thanks for your message.

  • Essa Sabbagh · Jan 25, 2025

    They have a 671B model; we'd need a workstation for it 😅🥲

    • Douglas Toledo · Jan 25, 2025

      Yes, hahaha, it always depends on your needs.

    • ZVHR El Ekhsaan · Jan 26, 2025

      Whoa, that 671B needs things like a Ryzen Threadripper and 6x RTX A6000 to run smoothly on a consumer workstation PC. But since it's DeepSeek, it must require fewer resources than GPT-4o.

  • Sharon Fabin · Jan 25, 2025

    Why would you advise using an extremely bad version like the 7B parameter one? If you're a software developer, just use the closed-source LLMs, or DeepSeek R1 and pay for its API, instead of getting really bad results with the local LLM, smh.

    • Roberto Maurizzi · Jan 25, 2025

      Agreed, the 7B isn't really good for coding, and the 8B is worse (it's distilled from Llama instead of Qwen like the 7B).

    • Douglas Toledo · Jan 25, 2025

      In the article I mention the option of paying for the API. I wouldn't pay if I had a machine with excellent hardware, which is one of the aims of this article. I think it's worthwhile for users to test it out locally first, as everyone has different needs.

    • CaptainDigitals · Jan 25, 2025

      Bolt.diy is a good option if you want to develop locally and pay for a DeepSeek API key.

      Not totally free, but the API cost is so low it doesn't make sense to host it locally.

      Consider the token costs: millions of input/output tokens for R1 cost fractions of a penny. A $2 load would last you for weeks or months.

      • Evan Marshall · Jan 31, 2025

        Can you point me in the direction of how to set up the DeepSeek R1 API and integrate it to replace Copilot in VSCode?

      • Mwelwa Mwansa · Feb 4, 2025

        One of the reasons to run it locally is to protect your data. The DeepSeek R1 servers are in China, and the data laws there are different.

    • ZVHR El Ekhsaan · Jan 26, 2025

      "..pay for it's API instead of getting really bad results with the local LLM smh"

      I already paid for my high-end PC with super hardware inside; now I want to f*ck it with AI, squeeze every bit of memory installed, and flex the GPU as well.

    • Mwelwa Mwansa · Feb 4, 2025

      The beauty of local LLMs is that we can fine-tune them to do what we want.

  • Varnit Sharma · Jan 25, 2025

    7B is totally incompatible with Roo or Cline. I'd suggest using continue.dev, which gives slightly better results; the only caveat is that it doesn't support automatic command running or file creation/edits.

  • Drift Johnson · Jan 25, 2025

    Nice article. I used the 32B version to write some Python; it doesn't beat Claude at all, but I find it very useful. Open WebUI is a dream, and Cline should be better, but there are some tasks DeepSeek R1 just doesn't understand.

    • Douglas Toledo · Jan 25, 2025

      I need to do more research on Open WebUI; I confess I'm not familiar with it yet.

  • Raj Singh · Jan 26, 2025

    Smaller models work best with aider, and somewhat with Bolt.diy too, if you know what you're doing and prompt them properly (but on limited tasks and in a small codebase, hence they work best with aider). They tend to lose context of what they did earlier, start hallucinating very fast, and get stuck.
    I've tried all the R1 models (Qwen versions) up to 14B locally (via Ollama and LM Studio) and 32B via HF Inference (free serverless API) with Cline, Roo Code, aider, and Bolt.diy. Absolutely useless in Cline/Roo Code. 14B and 32B are usable with aider if you generate proper instructions and a roadmap using powerful models for every phase and task. Also tried Phi-4 recently; it's surprisingly okay for such a small model in its tier!

    • Hernan Garcia · Jan 26, 2025

      Having the same experience with DS-7B. You mentioned that aider works better for 7B?

      • Hernan Garcia · Jan 26, 2025

        I've just been playing a bit with deepseek-r1:7B via Ollama + aider, and it works quite decently, especially with the --watch-files flag:

        aider --model ollama_chat/deepseek-r1:7b --watch-files

        • Hernan Garcia · Jan 26, 2025

          OK, my best results so far are with:
          aider --model ollama_chat/deepseek-r1:7b --watch-files --browser
          It enables a GUI to explore and edit files, a bit friendlier than the command line for multiple files.

          • Douglas Toledo · Jan 28, 2025

            I started coding with aider today and I'm impressed with it. Maybe I'll write about my experiences so far.

  • Rashid · Jan 26, 2025

    I have tried 7B with LM Studio and Cline in VSCode; even a simple prompt takes too long to get a response.

    • Douglas Toledo · Jan 27, 2025

      Try downloading a lighter model. If the results aren't good enough for your needs, I recommend paying for tokens and using their API key. I wish you the best.

  • Leon · Jan 27, 2025

    You would not have written this article as is if you had tested DeepSeek R1 locally with Cline yourself, or maybe your test case was super simple.
    I tested it with the 7B parameter one, the one distilled with Qwen. And it is really bad for any more or less complex coding task!
    Maybe DeepSeek R1 is good for chatting or "reasoning" or whatever, but not for software development.
    It is not capable of understanding technical requirements, nor of refining the requirements, nor of architecting a solution. It is simply bad, especially when you compare it to Claude!

    • Douglas Toledo · Jan 28, 2025

      Thank you very much for your comment and feedback.

      I tested Cline with the DeepSeek 1.5B, 7B, and 8B models. I've had to adapt to the limitations of my hardware and my needs. When I'm looking for something more accurate, I use API tokens.

      The article mentions the paid API option; the intention is to show the process of running it locally, understanding it, and adjusting and adapting it to each person's reality.

  • rafaone · Jan 28, 2025

    Do you trust running a Chinese model? It could be spyware.

  • Exileon · Jan 30, 2025

    I tried to download the 32B version and it keeps saying, "Error: model requires more system memory (18.2 GiB) than is available (7.4 GiB)", but I have more than 60 GB of free space. Is there any way to fix it?

    • Douglas Toledo · Feb 1, 2025

      Are you using Ollama, LM Studio, or Jan? Have you fixed it already?

  • Dhanush · Feb 18, 2025

    Needs excellent hardware

  • Jouwert van Geene · Mar 5, 2025

    I've tried the approach but keep getting "API Request..." and nothing happens. See this GitHub thread: github.com/cline/cline/issues/1407
    I just wanted to use it for a hobby project with VSCode + Cline + Ollama + deepseek-coder, but I got stuck.
