Supercharge Your Coding with Local LLMs: A Step-by-Step Guide featuring Phi-3 Mini

Author: Rajesh Pandhare

As AI continues to revolutionize the tech industry, Large Language Models (LLMs) are emerging as powerful tools for enhancing coding efficiency and creativity. By running LLMs locally on your own hardware, you can enjoy enhanced privacy, offline availability, and reduced latency compared to cloud-based solutions. In this article, we’ll explore how Kanaka Software is empowering developers with cutting-edge AI solutions, focusing on the Phi-3 Mini model and providing a practical, step-by-step guide to get you started.

Understanding Local LLMs and Their Advantages

LLMs are advanced AI models trained on vast amounts of text data, enabling them to understand and generate human-like text. When applied to coding tasks, LLMs can assist with code generation, autocompletion, and even problem-solving. By running LLMs locally, you can:

  • Ensure your code and data remain private and secure: Sensitive codebases and proprietary data never leave your machine.
  • Work offline without relying on internet connectivity: Develop anywhere, anytime, without interruptions.
  • Experience faster response times due to reduced latency: Get near-instant feedback and code suggestions.

Tools like Ollama simplify the management of these models, while extensions like CodeGPT seamlessly integrate them into your development environment.


Hardware Considerations and LLM Selection

Before diving into the setup process, it’s essential to understand the hardware requirements for running LLMs locally. While high-end GPUs can handle larger models, most developers can still leverage the power of LLMs on average consumer hardware, such as an M1 Mac or Windows i5 machine with 16GB of RAM.

When selecting an LLM for local deployment, consider factors like model size, performance, and capabilities. Here are five noteworthy models to explore:

  • Phi-3 Mini (3.8B): Ideal for moderate-scale projects, offering quick integration and low latency. May struggle with very complex code structures due to its smaller size.
  • deepseek-coder (1.3B): This model is designed for efficiency, particularly in environments with limited computing resources. Deepseek-coder is suitable for developers who need a balance between performance and resource consumption.
  • Codellama 7B: Excelling in versatility, Codellama 7B can handle a wide variety of programming languages and frameworks. Its robust capacity makes it ideal for comprehensive coding tasks, though its larger size might require more substantial hardware capabilities, potentially limiting its use in constrained environments.
  • Stable-code 3B: This model is particularly adept at handling general coding tasks and shines in scenarios requiring creative and context-aware coding suggestions, such as seasonal and event-driven projects.
  • SQLCoder: Defog SQLCoder is an LLM that generates SQL queries from natural-language descriptions. It is trained on a large dataset of SQL queries paired with their corresponding natural-language descriptions.

For this article, we’ll focus on the Phi-3 Mini model, which strikes a balance between power and efficiency on typical hardware.


Software Setup: Step-by-Step Guide

1. Installing Ollama

Ollama is a tool that simplifies the management of local LLMs. Follow these steps to install it on your system:

Windows:

  • Download Ollama: Visit the official Ollama website at https://ollama.com/download and click on the download link for your operating system.
  • Installation Process: Locate the downloaded Ollama installer file and double-click it to begin the installation. Follow the installation wizard's instructions; you may need to agree to the license terms and choose an installation directory. Once the installation is complete, Ollama is ready to use on your Windows system. To open a terminal, press Win + S, type cmd for Command Prompt or powershell for PowerShell, and press Enter. Alternatively, you can use Windows Terminal if you prefer a more modern experience.

MacOS:

  • Download Ollama: Visit the official Ollama website at https://ollama.com/download and click on the download link for your operating system.
  • Installation Process: Locate the downloaded Ollama zip file in your ~/Downloads folder (on an Apple Silicon Mac it is Ollama-darwin.zip). Double-click the zip to extract Ollama.app into ~/Downloads, then drag Ollama.app to your Applications folder. You can delete the downloaded zip file afterwards to save space. In the Applications folder, double-click Ollama and go through the setup wizard, which will prompt you to install the command-line version (ollama). On the next step you can skip running the llama3 model and click Finish.

System Requirements:

  • Windows: Windows 10 or later. For optimal performance, especially with larger models, a dedicated NVIDIA GPU is recommended for automatic hardware acceleration.
  • macOS: macOS 11 Big Sur or later. Allow 8GB of RAM for running 3B models, 16GB for 7B models, and 32GB for 13B models, plus about 12GB of disk space for Ollama and base models (additional space is required for storing model data). Any modern CPU with at least 4 cores is recommended; for 13B models, a CPU with at least 8 cores is recommended. A GPU is not required, but it can improve performance, especially with larger models.

Additional Notes:

  • Optimizing Performance: Close unnecessary applications to free up system resources, especially when running large models. Keep your GPU drivers up to date.
  • Troubleshooting: If you encounter installation problems, ensure your system is up to date and you have sufficient permissions to install new software. Running the installer as an administrator can sometimes help.


2. Downloading and Running the Phi-3 Mini Model

With Ollama installed, you can easily download and run the Phi-3 Mini model:

  • Open Terminal (macOS) or Command Prompt (Windows).
  • Run:

ollama run phi3:3.8b-mini-instruct-4k-fp16

  • phi3:3.8b-mini-instruct-4k-fp16: this model from the Phi-3 family is instruction-tuned, making it particularly effective at following coding-related instructions and providing intelligent code suggestions. It is optimized for local deployment, ensuring low latency and enhanced privacy. Once the model is downloaded and running, you can also query it programmatically, as shown in the sketch below.
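Beyond the interactive prompt, Ollama also serves a local HTTP API (by default at http://localhost:11434), which is what editor integrations talk to. The following is a minimal Go sketch, assuming Ollama's default endpoint and the model tag above, that sends a single prompt to Phi-3 Mini via the /api/generate endpoint; the struct names and the example prompt are illustrative, not part of the original setup:

package main

import (
    "bytes"
    "encoding/json"
    "fmt"
    "net/http"
)

// generateRequest mirrors the fields expected by Ollama's /api/generate endpoint.
type generateRequest struct {
    Model  string `json:"model"`
    Prompt string `json:"prompt"`
    Stream bool   `json:"stream"`
}

// generateResponse captures only the field we need from the reply.
type generateResponse struct {
    Response string `json:"response"`
}

func main() {
    body, _ := json.Marshal(generateRequest{
        Model:  "phi3:3.8b-mini-instruct-4k-fp16",
        Prompt: "Write a Go function that reverses a string.",
        Stream: false, // ask for one JSON reply instead of a stream of chunks
    })

    resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewReader(body))
    if err != nil {
        fmt.Println("request failed:", err)
        return
    }
    defer resp.Body.Close()

    var out generateResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        fmt.Println("decode failed:", err)
        return
    }
    fmt.Println(out.Response)
}

Setting Stream to false asks Ollama to return the whole completion as a single JSON object, which keeps the example short; streaming is the default and is what editor integrations typically use.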


3. Installing CodeGPT

CodeGPT is a VSCode extension that enables seamless integration with local LLMs. To install it:

  • Open VSCode and click the Extensions icon in the Activity Bar (or press Ctrl+Shift+X on Windows, Cmd+Shift+X on macOS).
  • Search for "CodeGPT", select the extension, and click Install.

4. Configuring CodeGPT for Phi-3 Mini

To connect CodeGPT with the Phi-3 Mini model:

  • Click on the CodeGPT chat icon in the left-hand sidebar.

  • In the chat panel, open the provider and model selector, choose Ollama as the provider, and select the phi3:3.8b-mini-instruct-4k-fp16 model you downloaded earlier.

Practical Demo: Building a Golang REST API

Let’s put Phi-3 Mini to the test by building a simple REST API in Go that handles CRUD operations for a ‘book’ resource (title, author, ISBN). Here’s the prompt we’ll use:

“Build a simple REST API in Go that handles CRUD operations for a ‘book’ resource (title, author, ISBN).”

Using CodeGPT with Phi-3 Mini

  • Open a new Go project in VSCode.
  • Create a new file, such as main.go.
  • In the CodeGPT Chat, enter the prompt: "Build a simple REST API in Go that handles CRUD operations for a 'book' resource (title, author, ISBN)."
  • The phi3:3.8b-mini-instruct-4k-fp16 model will generate boilerplate code for the REST API.



Refining and Extending the Generated Code

Review the generated code and modify it as needed. Always remember, this is AI-generated code, so use it as a copilot; the main pilot is you.

Use CodeGPT to interact with Phi-3 Mini for generating additional code snippets or suggestions.

Refine the API by adding error handling, validation, and other features.

Here’s an example of how the generated code might look:

package main

import (
    "encoding/json"
    "fmt"
    "log"
    "net/http"
    "strconv"
    "time"
    "github.com/gorilla/mux"
)

type Book struct {
    Title      string  `json:"title"`
    Author     string  `json:"author"`
    ISBN       string  `json:"isbn"`
    CreatedAt  time.Time `json:"created_at,omitempty"`
}

var books []Book

func getAllBooks(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(books)
}

func createBook(w http.ResponseWriter, r *http.Request) {
    var newBook Book
    _ = json.NewDecoder(r.Body).Decode(&newBook)
    newBook.CreatedAt = time.Now()
    books = append(books, newBook)
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(newBook)
}

// getBook looks up a book by its position in the slice; in this simple
// example the {id} path parameter is treated as the slice index.
func getBook(w http.ResponseWriter, r *http.Request) {
    params := mux.Vars(r)
    id, _ := strconv.Atoi(params["id"])
    for i, book := range books {
        if id == i {
            w.Header().Set("Content-Type", "application/json")
            json.NewEncoder(w).Encode(book)
            return
        }
    }
    http.Error(w, "Book not found", http.StatusNotFound)
}

func updateBook(w http.ResponseWriter, r *http.Request) {
    params := mux.Vars(r)
    id, _ := strconv.Atoi(params["id"])
    var updatedBook Book
    _ = json.NewDecoder(r.Body).Decode(&updatedBook)
    for i := range books {
        if id == i {
            books[i] = updatedBook
            w.Header().Set("Content-Type", "application/json")
            json.NewEncoder(w).Encode(updatedBook)
            return
        }
    }
    http.Error(w, "Book not found", http.StatusNotFound)
}

func deleteBook(w http.ResponseWriter, r *http.Request) {
    params := mux.Vars(r)
    id, _ := strconv.Atoi(params["id"])
    for i := range books {
        if id == i {
            books = append(books[:i], books[i+1:]...)
            w.WriteHeader(http.StatusNoContent)
            return
        }
    }
    http.Error(w, "Book not found", http.StatusNotFound)
}

func main() {
    fmt.Println("Welcome to Book Library")
    router := mux.NewRouter()

    router.HandleFunc("/books", getAllBooks).Methods("GET")
    router.HandleFunc("/books", createBook).Methods("POST")
    router.HandleFunc("/books/{id}", getBook).Methods("GET")
    router.HandleFunc("/books/{id}", updateBook).Methods("PUT")
    router.HandleFunc("/books/{id}", deleteBook).Methods("DELETE")

    log.Fatal(http.ListenAndServe(":8080", router))
}

To test the code, run the server with go run main.go (after initializing a module and fetching the router dependency with go mod init and go get github.com/gorilla/mux), then exercise the endpoints with curl, replacing {id} with the book's index (for example, 0):

curl -X GET http://localhost:8080/books

curl -X POST -H "Content-Type: application/json" -d '{"title":"New Book", "author":"Author Name", "isbn":"1234567890"}' http://localhost:8080/books

curl -X GET http://localhost:8080/books/{id}

curl -X PUT -H "Content-Type: application/json" -d '{"title":"Updated Book", "author":"Author Name", "isbn":"1234567890"}' http://localhost:8080/books/{id}

curl -X DELETE http://localhost:8080/books/{id}
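As suggested above, the generated handlers are only a starting point. Here is a minimal sketch of how the createBook handler from the code above could be refined with basic input validation and error handling; the exact checks and error messages are illustrative, not part of the model's output:

// createBook, refined with input validation and error handling.
func createBook(w http.ResponseWriter, r *http.Request) {
    var newBook Book
    if err := json.NewDecoder(r.Body).Decode(&newBook); err != nil {
        http.Error(w, "invalid JSON body", http.StatusBadRequest)
        return
    }
    if newBook.Title == "" || newBook.Author == "" || newBook.ISBN == "" {
        http.Error(w, "title, author and isbn are required", http.StatusBadRequest)
        return
    }
    newBook.CreatedAt = time.Now()
    books = append(books, newBook)
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusCreated)
    json.NewEncoder(w).Encode(newBook)
}

Dropping this in place of the original handler keeps the rest of the file unchanged, while malformed or incomplete requests are rejected with a 400 instead of silently being stored as empty books.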

Final Thoughts

By leveraging local LLMs like Phi-3 Mini, you can supercharge your coding efficiency and creativity. Kanaka Software is committed to empowering developers with cutting-edge AI solutions, and we encourage you to explore the world of local LLMs and experiment with different models and applications.

If you’re interested in learning more about Kanaka Software or would like to stay updated on our latest developments, please visit our website and follow us on LinkedIn.

Remember, the future of coding is here, and it’s powered by AI!


