How to run DeepSeek-R1:1.5b LLM on Android using Termux


Large Language Models (LLMs) have revolutionized natural language processing, unlocking capabilities that were once unimaginable. However, deploying these models on mobile devices has remained a complex challenge, requiring solutions that balance computational demands with hardware limitations.

Running an LLM like DeepSeek-R1:1.5b on an Android device pushes the boundaries of on-device AI, offering powerful natural language generation without relying on cloud infrastructure. 

For instance, this setup enables local handling of advanced tasks such as real-time decision support or automated code debugging, making it both efficient and accessible.

The ability to achieve this on mobile opens up possibilities for real-time, offline use cases, particularly in areas where connectivity or privacy is critical. But to make this possible, a robust environment is essential—this is where Termux becomes the key. 

This guide walks you through the process, ensuring you can set up and run the DeepSeek-R1:1.5b LLM efficiently while also exploring ways to extend it into a free API for broader application integration.

Understanding reasoning models

Reasoning models are specialized language models designed to excel at logical reasoning, problem-solving, and structured thinking tasks. 

Unlike common language models that primarily focus on generating coherent text by predicting the next word in a sequence, reasoning models go a step further by integrating structured reasoning mechanisms. These mechanisms enable them to perform logical inference, step-by-step problem-solving, and decision-making, making them adept at handling complex tasks that require more than just text generation.

They are typically trained using techniques such as GRPO (Group Relative Policy Optimization) and supervised fine-tuning on datasets containing explicit reasoning traces. This training enables them to perform structured reasoning and handle complex problem-solving scenarios effectively.

This specialization makes reasoning models particularly effective in domains requiring precise and logical decision-making, such as mathematical problem-solving, legal reasoning, and complex question-answering systems.

DeepSeek-R1: Architecture and Training Approach

DeepSeek-R1 is a reinforcement learning-driven LLM designed to enhance reasoning capabilities while maintaining structured, user-friendly outputs. Unlike traditional LLMs that depend on supervised fine-tuning (SFT) and RLHF, DeepSeek-R1 optimizes reasoning through self-improving reinforcement learning (RL), reducing reliance on human-labeled data.

Architecture of DeepSeek-R1

1. Cold Start Fine-Tuning for Stability

DeepSeek-R1 builds upon DeepSeek-V3-Base but starts with a small, high-quality dataset of structured Chain of Thought (CoT) reasoning.

  • This phase stabilizes RL training by ensuring the model learns structured reasoning before reinforcement learning begins.
  • Unlike models that jump directly into RL, this step prevents instability and improves interpretability.

2. Reinforcement Learning for Reasoning

DeepSeek-R1 uses Group Relative Policy Optimization (GRPO) to refine reasoning.

  • GRPO replaces the critic model by optimizing grouped response evaluations, making RL more efficient and stable.
  • Rule-based rewards ensure accuracy in math, coding, and logical reasoning while format rewards enforce structured, readable outputs.
  • A language consistency reward prevents issues like mixed-language responses.
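
The group-relative advantage at the heart of GRPO can be written compactly: for a group of G responses sampled for the same prompt, with rewards r_1 through r_G, each response's advantage is its reward normalized against the rest of the group:

```latex
% Group-relative advantage: the sampled group acts as its own baseline,
% so no separate learned critic (value network) is needed.
A_i = \frac{r_i - \mathrm{mean}(\{r_1, r_2, \ldots, r_G\})}{\mathrm{std}(\{r_1, r_2, \ldots, r_G\})}
```

Because the baseline comes from the group itself rather than a critic model, this is what makes the RL stage cheaper and more stable than classic PPO-style training.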

3. Supervised Fine-Tuning for Generalization

After RL optimizes reasoning, DeepSeek-R1 undergoes SFT to enhance non-reasoning tasks such as writing, translation, and factual QA.

  • Uses rejection sampling to filter high-quality reasoning samples for consistency.
  • Avoids training on massive, unfiltered datasets, ensuring better alignment with structured reasoning.

4. Final Reinforcement Learning for Human Alignment

A second RL stage refines helpfulness and safety after reasoning is optimized.

  • Unlike traditional RLHF models that prioritize alignment first, DeepSeek-R1 focuses on reasoning first, alignment second, ensuring strong logical performance without sacrificing clarity.
  • Rule-based rewards optimize reasoning, while human preference models refine responses for usability and safety.

Device requirements

Before proceeding with the installation, ensure your Android device meets these minimum requirements for running the DeepSeek-R1:1.5B model on your phone:

  • Android version: 7.0 (Nougat) or higher
  • Available storage: At least 8GB free space
  • RAM: Minimum 6GB, recommended 8GB or more
  • Processor: Snapdragon 845 or equivalent (8 cores recommended)
  • Architecture: ARM64 (aarch64)

Note: The model's performance will vary significantly based on your device's specifications. Higher-end devices will provide better response times and more stable operation.
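
You can verify most of these requirements directly from a Termux shell. A quick sketch (`getprop` exists only on Android, so it is guarded here):

```shell
# Print the CPU architecture (should be aarch64 for this guide)
uname -m
# Show total and available RAM
free -h
# Show free storage in the home directory
df -h "$HOME" 2>/dev/null || df -h
# Android version (skipped silently on non-Android systems)
command -v getprop >/dev/null && getprop ro.build.version.release || true
```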

Prerequisites

1. An Android device with sufficient storage and processing power.

 2. Termux app installed on your device. Download the Termux APK from Termux Releases.

 3. A stable internet connection.

Termux is an Android terminal emulator and Linux environment app that is crucial for this setup. Ensure your device meets these prerequisites before proceeding.

Troubleshooting common issues

Memory management

If you encounter out-of-memory errors:

  • Close background apps before running the model
  • Use a swap file to extend available memory (creating one under /data requires root access):

dd if=/dev/zero of=/data/swap bs=1M count=2048
mkswap /data/swap
swapon /data/swap

Performance optimization

To improve model performance:

  • Enable high-performance mode on your device if available
  • Set CPU governor to performance:

su
echo "performance" > /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

Note: This requires root access and may affect battery life.

Step-by-step guide

1. Set up the Termux environment

 First, ensure that your Termux environment is properly configured. Run the following commands: 

termux-setup-storage
pkg upgrade
pkg install git cmake golang

  • termux-setup-storage: Grants Termux access to your device's file storage.
  • pkg upgrade: Updates Termux packages to their latest versions.
  • pkg install git cmake golang: Installs the necessary tools, including Git, CMake, and Go.

Additional recommended packages:

pkg install python clang make wget

These steps prepare your Termux environment for installing and running advanced software like Ollama.

2. Install Ollama

Ollama is a tool that allows you to run LLMs locally. You can build it from source as shown below; recent Termux repositories may also ship a prebuilt package (pkg install ollama) that you can try first.

git clone --depth 1 https://github.com/ollama/ollama.git
cd ollama
go generate ./...
go build .
./ollama serve &

  • git clone: Clones the Ollama repository.
  • go generate ./...: Generates code required for the project.
  • go build .: Builds the Ollama binary.
  • ./ollama serve &: Starts the Ollama server in the background.
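
Once the server is started in the background, you can confirm it answers on Ollama's default port (11434) with a quick probe (assumes curl, installable via pkg install curl):

```shell
# Probe Ollama's default endpoint; /api/tags lists installed models
if curl -s --max-time 2 http://127.0.0.1:11434/api/tags >/dev/null; then
    echo "Ollama server is up"
else
    echo "Ollama server not reachable"
fi
```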

Important configuration tips:

  • Set GOMAXPROCS to optimize for your device:

export GOMAXPROCS=$(nproc)

  • Configure memory limits:

export OLLAMA_HOST=127.0.0.1
export OLLAMA_MODELS=/data/data/com.termux/files/home/models
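
To make these settings persist across Termux sessions, append them to your shell profile (a sketch reusing the same values as above):

```shell
# Append Ollama-related environment variables to ~/.bashrc
# (quoted heredoc keeps $(nproc) unexpanded, so it is evaluated at shell startup)
cat >> ~/.bashrc <<'EOF'
export GOMAXPROCS=$(nproc)
export OLLAMA_HOST=127.0.0.1
export OLLAMA_MODELS=/data/data/com.termux/files/home/models
EOF
echo "Saved. Restart Termux or run: source ~/.bashrc"
```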

3. Run DeepSeek-R1:1.5b

With the server running, you can now execute the DeepSeek-R1:1.5b LLM:

./ollama run deepseek-r1:1.5b

Model configuration options can be set inside the interactive session; for example, to adjust the context window:

./ollama run deepseek-r1:1.5b
>>> /set parameter num_ctx 2048

Performance monitoring:

top -p $(pgrep ollama)
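
DeepSeek-R1 models emit their chain of thought between <think> and </think> tags before the final answer. If you consume the raw output programmatically, you may want to separate the two; a minimal sketch:

```python
import re


def split_reasoning(text: str) -> tuple[str, str]:
    """Split DeepSeek-R1 output into (reasoning, answer).

    Assumes the model wraps its chain of thought in <think>...</think>,
    the default output format of DeepSeek-R1 models.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer
```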

4. Creating an API endpoint

To expose your local model as an API endpoint:

pkg install nginx

Create a simple proxy configuration:

server {
    listen 8080;
    location /v1 {
        proxy_pass http://127.0.0.1:11434;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

Start the server:

nginx -c /path/to/config

Your API will be accessible at: http://your-device-ip:8080/v1/chat/completions (Ollama's OpenAI-compatible endpoint)
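
For quick programmatic access you can also call Ollama's native API directly on port 11434, bypassing the proxy. A minimal stdlib-only sketch following Ollama's /api/generate request format:

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434/api/generate"


def build_payload(prompt: str, model: str = "deepseek-r1:1.5b") -> dict:
    """Request body for Ollama's /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(prompt: str) -> str:
    """Send a prompt to the local model and return its response text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, `generate("Why is the sky blue?")` returns the model's full (non-streamed) response string.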

5. Security considerations

  • Use API keys for authentication
  • Implement rate limiting
  • Monitor system resources
  • Regular security updates:

pkg update
pkg upgrade
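
Rate limiting can be added directly in the nginx proxy from step 4. A sketch (the zone name and limits are illustrative, not prescriptive):

```nginx
# Allow 10 requests/second per client IP, with short bursts queued
limit_req_zone $binary_remote_addr zone=llm_api:10m rate=10r/s;

server {
    listen 8080;
    location /v1 {
        limit_req zone=llm_api burst=20;
        proxy_pass http://127.0.0.1:11434;
    }
}
```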

Monitoring and maintenance

Resource usage tracking

#!/bin/bash
while true; do
   ps aux | grep [o]llama   # bracket trick avoids matching the grep process itself
   free -h
   sleep 5
done

Automatic recovery

Create a simple watchdog script:

#!/bin/bash
while true; do
   if ! pgrep -x "ollama" > /dev/null; then
       echo "Restarting Ollama..."
       ./ollama serve &
   fi
   sleep 60
done

Model performance comparison across benchmarks

| Model | AIME 2024 (pass@1) | AIME 2024 (cons@64) | MATH-500 (pass@1) | GPQA Diamond (pass@1) | LiveCodeBench (pass@1) | Codeforces rating |
|---|---|---|---|---|---|---|
| GPT-4o-0513 | 9.2 | 13.4 | 74.6 | 49.9 | 32.9 | 759 |
| Claude-3.5-Sonnet-1022 | 16.0 | 26.7 | 78.3 | 65.0 | 38.9 | 717 |
| o1-mini | 63.6 | 80.0 | 90.0 | 60.0 | 53.8 | 1820 |
| QwQ-32B-Preview | 44.0 | 60.0 | 90.6 | 54.5 | 41.9 | 1316 |
| DeepSeek-R1-Distill-Qwen-1.5B | 28.9 | 52.7 | 83.9 | 33.8 | 16.9 | 954 |
| DeepSeek-R1-Distill-Qwen-7B | 55.5 | 83.3 | 92.8 | 49.1 | 37.6 | 1189 |
| DeepSeek-R1-Distill-Qwen-14B | 69.7 | 80.0 | 93.9 | 59.1 | 53.1 | 1481 |
| DeepSeek-R1-Distill-Qwen-32B | 72.6 | 83.3 | 94.3 | 62.1 | 57.2 | 1691 |
| DeepSeek-R1-Distill-Llama-8B | 50.4 | 80.0 | 89.1 | 49.0 | 39.6 | 1205 |
| DeepSeek-R1-Distill-Llama-70B | 70.0 | 86.7 | 94.5 | 65.2 | 57.5 | 1633 |
| Gemini 2.0 Flash (Experimental) | 63.0 | N/A | 89.7 | 61.2 | 35.1 | N/A |
| o1 | 79.8 | 83.0 | 96.4 | 75.7 | 76.6 | 1820* |
| DeepSeek-R1 | 79.8 | 86.7 | 97.3 | 71.5 | 65.9 | 2029 |


Model performance across benchmarks can vary and is subject to change due to updates, new evaluations, or changes in testing conditions. Always verify with the latest data.

Benefits and applications

Running DeepSeek-R1:1.5b locally on Android has several advantages:

1. Low-cost API endpoint

  • Build a chatbot that processes user queries
  • Integrate with mobile apps
  • Create custom REST APIs
  • Implement webhooks for automation

2. Offline capabilities

  • Local processing
  • No internet dependency
  • Reduced latency
  • Better privacy

3. Data privacy

  • Complete data control
  • No external service dependency
  • Compliance friendly
  • Perfect for sensitive applications

4. Customization

  • Model fine-tuning options
  • Integration flexibility
  • Prompt engineering
  • Performance optimization

Conclusion

Deploying the DeepSeek-R1:1.5b Large Language Model (LLM) on your Android device using Termux transforms your smartphone into a powerful AI tool. 

This setup enables advanced natural language processing tasks directly on your device, ensuring privacy, reducing latency, and eliminating reliance on cloud services.

By following this guide, you've equipped your device to handle advanced reasoning tasks such as real-time code generation, interactive debugging, and context-aware query resolution. Termux provides a robust Linux environment, enabling the DeepSeek-R1:1.5b model to deliver sophisticated, on-device AI capabilities.

Extending this setup into a free API endpoint enhances its versatility, allowing integration with various applications and services—a capability particularly valuable in scenarios where real-time processing and data privacy are paramount.

As you continue to explore and optimize this configuration, monitor system performance and manage resources effectively to maintain optimal operation. Stay updated with the latest developments in LLM deployment on mobile platforms to leverage new features and improvements.

Embracing on-device AI not only enhances your Android device's functionality but also contributes to the broader movement toward decentralized, open source and accessible artificial intelligence solutions.

Written by
Ananya Rakhecha
Tech Advocate