
Edgen: Run GenAI Locally


Running GenAI models, particularly large language models (LLMs), on devices like laptops and smartphones poses significant challenges. These models demand considerable memory and computational power, traditionally available only in cloud-based environments. Edgen addresses these challenges by providing a local API server that simplifies edge-based GenAI deployment. Using Edgen is similar to using OpenAI's API, but with the added benefits of on-device processing.

Advantages of On-device Inference

On-device inference brings several advantages when compared to cloud inference:

  • Free: It runs locally on hardware the user already owns.
  • Optimized: Edgen uses the latest techniques and runtimes to optimize the inference of GenAI models.
  • Scalable: A growing user base doesn't require more cloud infrastructure. Each user runs inference on their own hardware.
  • No internet required: With on-device inference, users don't need an internet connection.
  • Private: On-device inference means users' data never leaves their devices.

Discover Edgen: Open Source with a Permissive License

What is Edgen?


⚡Edgen architecture overview

Why Edgen?

Quick integration with any platform

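  • Python:

    from edgen import Edgen
    client = Edgen()
    completion = client.chat.completions.create(  # OpenAI-style call
        model="default",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"}
        ],
        stream=True,
    )
    for chunk in completion:  # chunks arrive as they are generated
        print(chunk.choices[0].delta)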
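  • Node:

    import Edgen from "edgen";
    const client = new Edgen();
    async function main() {
      // OpenAI-style call; stream: true returns an async iterable of chunks.
      const completion = await client.chat.completions.create({
        model: "default",
        messages: [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "Hello!"}
        ],
        stream: true,
      });
      for await (const chunk of completion) {
        console.log(chunk.choices[0].delta.content);
      }
    }
    main();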

One of Edgen's key strengths is its versatility and ease of integration across various programming environments. Whether you're a Python aficionado, a C++ veteran, or a JavaScript enthusiast, Edgen caters to your preferred platform with minimal setup requirements.
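
Since Edgen is described as working much like OpenAI's API, a local server of this kind can in principle also be reached without a dedicated client library. The sketch below uses only Python's standard library to build such a request, without sending it. The base URL (host, port, and path) and the helper name `build_chat_request` are assumptions made for illustration, so check your own installation for the actual address.

```python
import json
import urllib.request

# Assumption: a local server exposing an OpenAI-style endpoint at this address.
# Adjust the host, port, and path to match your installation.
BASE_URL = "http://localhost:33322/v1"


def build_chat_request(messages, model="default", stream=False):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {"model": model, "messages": messages, "stream": stream}
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ]
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To send it once a server is running:
#   with urllib.request.urlopen(request) as response:
#       print(json.load(response))
```

Keeping the request construction separate from the network call makes the payload easy to inspect, and the same JSON body can be sent from any language with an HTTP client.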

Inference and Memory Optimization

The main obstacle to overcome when running GenAI models on-device is memory. Models like LLMs are big and require a lot of memory to run. Consumer-grade hardware often lacks the memory to hold them at full precision, but model compression techniques such as quantization, pruning, sparsification, and knowledge distillation can shrink the memory footprint of GenAI models.
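
To get a feel for the numbers, here is a rough back-of-the-envelope estimate of how much memory the weights alone need at different precisions. The 7-billion-parameter model size is a hypothetical example, and the estimate ignores activations, caches, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical model size, chosen only for illustration.
PARAMS = 7_000_000_000

for label, bits in [("float32", 32), ("float16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gb(PARAMS, bits):5.1f} GB")
```

Halving the number of bits per weight halves the weight footprint, which is why lower-precision formats can bring a model within reach of a laptop's memory.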

Edgen leverages the latest techniques and runtimes to optimize the inference of GenAI models. This means that inference is fast and efficient even on low-end devices, and developers building their apps with Edgen don't need to be experts in ML optimization to get the best performance out of their models.
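
To make one of these techniques concrete, the following is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. It is a textbook illustration under simplified assumptions, not a description of how Edgen or its runtimes quantize models; the function names and sample weights are made up for the example.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One scale factor for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the integers and the scale."""
    return [q * scale for q in quantized]


# Made-up weights, for illustration only.
weights = [0.12, -0.98, 0.45, 0.0, 0.77]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(q)
print(f"max round-trip error: {max_error:.5f}")
```

Each weight is stored as a single byte plus one shared scale, and the round-trip error stays within half a quantization step. Production runtimes typically use finer-grained schemes, such as per-channel or per-block scales, to keep accuracy loss small.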

Rust: The Choice for Performance and Safety

Rust is central to Edgen's development for its exceptional performance and reliability. Here's why Rust makes Edgen stand out:

  • Performance: Combines C++-like speed with enhanced memory safety, perfect for high-performance needs.
  • Memory Safety: Enforces memory safety at compile time through ownership and borrowing, without a garbage collector, so resource management stays efficient and predictable.
  • Safe Concurrency: Catches data races at compile time, making complex, multi-threaded applications safer to build.
  • Evolving Ecosystem: A growing range of libraries and tools in Rust's ecosystem continually empowers Edgen.

The Edgen Chat Use Case

Edgen Chat demonstrates what Edgen can do. This application is more than just a local chat tool: it includes advanced document functionality, can learn from unstructured data, and integrates with various data storage platforms.

Building Collaboratively: Edgen's Community & Open Source Ethos

The true strength of Edgen lies in its community-driven approach.

  • Community Contributions: Whether you're a developer, a data scientist, or an AI enthusiast, your insights and contributions are invaluable. From developing plugins to providing constructive feedback, every contribution is welcome.

  • Open Source Philosophy: We believe open-sourcing Edgen will create a transparent and collaborative environment, with the goal of making advanced AI accessible to a broader audience.

Edgen's Future Roadmap

Our team is committed to pushing the boundaries of what's possible in on-device AI.

  • Enhanced Model Support: We're always expanding our library to include the latest GenAI models, making them available through Edgen's plug-and-play endpoints.

  • Software Integrations: Recognizing the importance of interoperability, we are working towards more integrations with popular software platforms. This will enable a smooth workflow for developers and users alike.

Enabling local and private GenAI with Edgen

Stepping Forward with AI Innovation

Edgen focuses on making AI more accessible, efficient, and respectful of privacy.

  • Experience Edgen: Give Edgen a try and see how it enhances your AI projects.
  • Join the Community: Be a part of our open-source community. Your input and contributions are crucial in shaping the path of Edgen and on-device AI.
  • Spread the Word: Share your experience with Edgen in our Discord or Subreddit.

Get started with Edgen.

Check out our Subreddit.

Join us on Discord.