Modus AI Swiss Toolkit

Introduction

Why spend hours automating something you could do manually in just 10 minutes?

When we think about AI, many of us imagine it replacing humans. While that might be possible someday, today is not that day. Developers have long been building abstractions to automate their tedious and stressful tasks. For this hackathon, after hours of brainstorming, I decided it was time to create an "AI Swiss Army Knife Tool" that would take automation to the next level—beyond what code alone could achieve.

Popular websites like Jam.dev and DevToys simplify tasks by offering tools such as converters and code formatters. These are invaluable when you're in a pinch. However, the primary limitation of these tools is their lack of analytical capabilities. Developers must account for all edge cases for the tool to be useful, and the results are often static.

That's where my tool comes in—it features an AI-powered code editor with a copilot function, video-to-transcript conversion with AI summarization, image-to-text capabilities, and more.


Project Demo

You can watch the project demo here: https://youtu.be/tiLvLzRzPJI


What is Hypermode?

Hypermode, as I understand it, is a platform-as-a-service (PaaS) that lets developers integrate AI models without worrying about infrastructure or complex configuration. It simplifies model integration by letting developers write functions with the Modus CLI or plain code, abstracting away the technical complexity. Think of it as Ollama for production environments, or as Vercel’s AI SDK but tailored for backend developers.


Getting Started with Hypermode

First, register with Hypermode using your Google or GitHub account. I recommend GitHub, since you will need to link your repository later anyway; signing up with it saves a step.

signup-page

Click the New Project button and give your project a memorable name.

hypermode create project


Modus CLI: The Heart of Hypermode

The Modus CLI is a powerful toolkit that enables developers to build AI projects using languages like Golang and AssemblyScript (a TypeScript-like language). It outputs a fully functional GraphQL server, making it easy to query the AI functions.

Installing Modus CLI

Ensure you have the latest Node.js installed, then run the following command:

npm install -g @hypermode/modus-cli

Once installed, scaffold a new Modus project by running:

modus new

Follow the on-screen instructions. I chose AssemblyScript because of its familiar TypeScript-like syntax.
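
The scaffold gives you a working assembly/index.ts right away. As a rough illustration (the function name here is just a placeholder, not necessarily what the template generates), any exported AssemblyScript function becomes a GraphQL field:

// assembly/index.ts: every exported function is exposed as a GraphQL query field
export function sayHello(name: string = "World"): string {
  return `Hello, ${name}!`;
}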


Project Structure

This project leverages Next.js and AssemblyScript, and here’s how the file structure looks:

├── app
├── asconfig.json
├── assembly // AssemblyScript functions reside here
├── build // Modus build artifacts (WASM files, env files)
├── modus.json
├── next.config.mjs
├── node_modules
├── package-lock.json
├── package.json
└── tsconfig.json

Assembly folder structure:

// assembly folder
├── google // Utility folder with helper functions
├── index.ts // AssemblyScript file
└── tsconfig.json

Linking Your Project in Hypermode

To deploy your functions for public use, link your project to Hypermode. You can either:

  1. Commit all changes to GitHub and link via the Hypermode console.

  2. Use the Hyp CLI and run hyp link.

In this guide, we’ll use GitHub. Follow the dashboard instructions, then create a ci-modus-build.yml file in .github/workflows/.

name: ci-modus-build
on:
  workflow_dispatch:
  pull_request:
  push:
    branches:
      - main

env:
  MODUS_DIR: ""

permissions:
  contents: read

jobs:
  build:
    name: Build
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Locate modus.json
        id: set-dir
        run: |
          MODUS_JSON=$(find $(pwd) -name 'modus.json' -print0 | xargs -0 -n1 echo)
          if [ -n "$MODUS_JSON" ]; then
            MODUS_DIR=$(dirname "$MODUS_JSON")
            echo "MODUS_DIR=$MODUS_DIR" >> $GITHUB_ENV
          else
            echo "modus.json not found"
            exit 1
          fi

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: "22"

      - name: Setup Go
        uses: actions/setup-go@v5

      - name: Setup TinyGo
        uses: acifani/setup-tinygo@v2
        with:
          tinygo-version: "0.34.0"

      - name: Build project
        run: npx -p @hypermode/modus-cli -y modus build
        working-directory: ${{ env.MODUS_DIR }}
        shell: bash

      - name: Publish GitHub artifact
        uses: actions/upload-artifact@v4
        with:
          name: build
          path: ${{ env.MODUS_DIR }}/build/*

Push the changes to GitHub, and your builds will appear on the Hypermode dashboard.


Setting Up Your Project for Development

Hypermode supports various AI models. Specify them in modus.json. Here’s an example configuration:

{
  "connections": {
    "google": {
      "type": "http",
      "endpoint": "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent",
      "headers": {
        "X-goog-api-key": "{{GOOGLE_API_KEY}}"
      }
    }
  },
  "models": {
    "text-generator": {
      "sourceModel": "meta-llama/Meta-Llama-3.1-8B-Instruct",
      "provider": "hugging-face",
      "connection": "hypermode"
    }
  }
}

You can set your connection environment variables under Sidebar > Settings > Connections, as shown in the image below:

Project Connections

## Variables are defined using the following syntax:
MODUS_<CONNECTION_NAME_IN_CAPS>_<VARIABLE_NAME>

## e.g. for a connection named "assemblyai" with a variable named "ASSEMBLY_API_KEY":
MODUS_ASSEMBLYAI_ASSEMBLY_API_KEY

## Usage in the modus.json connection definition:
"Bearer {{ASSEMBLY_API_KEY}}"

## Usage in .env.dev.local or any other .env file:
MODUS_ASSEMBLYAI_ASSEMBLY_API_KEY=""

The image below shows the models you can integrate into your project.

Supported models
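
Once the "text-generator" model above is declared, it can be invoked from AssemblyScript through the Modus models API. The sketch below follows the text-generation examples in the Modus documentation rather than this project's code, so treat the message construction and temperature as illustrative:

import { models } from "@hypermode/modus-sdk-as";
import {
  OpenAIChatModel,
  SystemMessage,
  UserMessage,
} from "@hypermode/modus-sdk-as/models/openai/chat";

// Invokes the "text-generator" model declared in modus.json
export function generateText(instruction: string, prompt: string): string {
  const model = models.getModel<OpenAIChatModel>("text-generator");

  const input = model.createInput([
    new SystemMessage(instruction),
    new UserMessage(prompt),
  ]);
  input.temperature = 0.7;

  const output = model.invoke(input);
  return output.choices[0].message.content.trim();
}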


Using AssemblyScript in Hypermode

Here’s a simple example of an AssemblyScript function that extracts text from images. The first block is a helper class from the assembly/google utility folder; the second is the exported aiImageToText function in assembly/index.ts:


// assembly/google/index.ts: request payload helper classes for the Gemini API
@json
class ReqContent {
  parts!: Part[];
}

@json
class Part {
  @omitnull()
  text: string | null = null;

  @omitnull()
  inline_data: InlineData | null = null;
}


@json
class InlineData {
  mime_type!: string;
  data!: string;
}

// Utility class which is used for Gemini Image prompt
@json
export class GeminiImagePrompt {
  contents: ReqContent[];
  constructor(prompt: string, base64encoded: string) {
    this.contents = [
      {
        parts: [
          { text: prompt, inline_data: null },
          {
            text: null,
            inline_data: {
              mime_type: "image/jpeg",
              data: base64encoded,
            },
          },
        ],
      },
    ];
  }
}

// assembly/index.ts: the exported Modus function
import { http } from "@hypermode/modus-sdk-as";
import { GeminiImagePrompt } from "./google";

export function aiImageToText(base64Img: string, prompt: string | null = null): string {
  let header = new http.Headers();
  let template = `
  Extract data from the image, especially the text.
  ${prompt ? `Additional info: ${prompt}` : ""}
  `;

  const body = http.Content.from(new GeminiImagePrompt(template, base64Img));

  const request = new http.Request(
    `https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent`,
    { body, method: "POST", headers: header }
  );

  const response = http.fetch(request);
  if (!response.ok) {
    throw new Error(`Failed to analyze image: ${response.status} ${response.text()}`);
  }

  // GeminiGenerateOutput (not shown here) is another @json helper that
  // flattens the Gemini response down to a single text field.
  return response.json<GeminiGenerateOutput>().text;
}

By integrating Hypermode and leveraging AssemblyScript, we can create highly scalable AI solutions with minimal effort.

The function aiImageToText is a simplified example demonstrating how to interact with AI models using AssemblyScript. It converts a base64-encoded image to text by sending a request to the Gemini API. You can expand this further by adding error handling, logging, or using a different AI model depending on your requirements.

Testing Locally with Modus CLI

After defining the function, you can test it locally using the Modus CLI:

modus dev

This command starts a local GraphQL server, allowing you to test your AI functions before deploying them. You can query your functions using GraphQL clients like Postman, Insomnia, or even the built-in GraphQL playground provided by Hypermode.
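
If you prefer testing from a script instead of a GUI client, a plain fetch against the local endpoint works as well. This is just a quick sketch; the variable values are placeholders:

// quick-test.ts: assumes `modus dev` is running on the default port
// (run with e.g. `npx tsx quick-test.ts`; Node 18+ provides global fetch)
const query = /* GraphQL */ `
  query AiImageToText($base64Img: String!, $prompt: String) {
    aiImageToText(base64Img: $base64Img, prompt: $prompt)
  }
`;

const response = await fetch("http://localhost:8686/graphql", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    query,
    variables: { base64Img: "<base64-encoded image>", prompt: "Focus on any totals" },
  }),
});

console.log(JSON.stringify(await response.json(), null, 2));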

Testing in Postman:

The Modus CLI spins up a fast GraphQL server when we run modus dev, which can be accessed at http://localhost:8686/graphql. Using Postman, we can test the API before integrating it with Next.js.

As seen in the query tab on the left, Postman introspected the server and returned all the functions we defined in AssemblyScript. We can query them using GraphQL, which is very cool.

GraphQL schema in Postman

The best part is that we can pass arguments and receive results directly from the LLM or connections. In the image below, we query the ZenQuotes API for a random quote.

Executing the randomQuote function

As expected, we received a response with a life-related quote.

Viewing the results
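
For context, the randomQuote function behind this query is just a thin wrapper around http.fetch. The sketch below is modeled on the Modus quickstart rather than copied from this project, so the exact shape may differ (the ZenQuotes host also has to be declared as a connection in modus.json):

import { http } from "@hypermode/modus-sdk-as";


@json
class Quote {
  @alias("q")
  quote!: string;

  @alias("a")
  author!: string;
}

export function randomQuote(): Quote {
  const request = new http.Request("https://zenquotes.io/api/random");
  const response = http.fetch(request);

  if (!response.ok) {
    throw new Error(`Failed to fetch a quote: ${response.status} ${response.statusText}`);
  }

  // ZenQuotes returns an array containing a single quote object
  return response.json<Quote[]>()[0];
}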


GraphQL Codegen:

Maintaining separate TypeScript interfaces for our GraphQL queries and responses, while also handling loading states and errors by hand, quickly becomes tedious. We can improve the developer experience by adding GraphQL Codegen and TanStack Query to our stack.

# Install the following dev dependencies:
npm i -D @graphql-codegen/cli @graphql-codegen/client-preset @graphql-codegen/schema-ast @parcel/watcher @0no-co/graphqlsp

We'll use @graphql-codegen/cli with the client preset to generate the GraphQL schema definition and TypeScript types, minimizing the risk of errors. Create a codegen.ts file at the project root:

import type { CodegenConfig } from "@graphql-codegen/cli";

const config: CodegenConfig = {
  schema: "http://localhost:8686/graphql", // Modus CLI server
  documents: ["./**/*.tsx"],
  ignoreNoDocuments: true,
  generates: {
    "./graphql/": {
      preset: "client",
      config: {
        documentMode: "string",
      },
    },
    "./schema.graphql": {
      plugins: ["schema-ast"],
      config: {
        includeDirectives: true,
      },
    },
  },
};

export default config;

To generate the code, simply execute:

npx graphql-codegen --config codegen.ts

Usage with TanStack Query:

We'll add a utility function to query the GraphQL API. It accepts the typed GraphQL document as the first argument and the query variables as the second.

import type { TypedDocumentString } from "./graphql";

const headers: Record<string, string> = {
  "Content-Type": "application/json",
  Accept: "application/graphql-response+json",
};

export async function execute<TResult, TVariables>(
  query: TypedDocumentString<TResult, TVariables>,
  ...[variables]: TVariables extends Record<string, never> ? [] : [TVariables]
) {
  // Replace GRAPHQL_URL with your Modus endpoint,
  // e.g. http://localhost:8686/graphql during local development.
  const response = await fetch("GRAPHQL_URL", {
    method: "POST",
    headers,
    body: JSON.stringify({
      query,
      variables,
    }),
  });

  if (!response.ok) {
    throw new Error("Network response was not ok");
  }

  return response.json() as Promise<{ data: TResult }>;
}

Using it with TanStack Query is intuitive:

import { useMutation } from "@tanstack/react-query";
import { graphql } from "@/graphql";
import { execute } from "@/graphql/execute";

const aiImageToText = graphql(`
  query AiImageToText($base64Img: String!, $prompt: String!) {
    aiImageToText(base64Img: $base64Img, prompt: $prompt)
  }
`);

// Type-safe query with loading states and error handling
const { data, isPending, error, mutate } = useMutation({
  mutationKey: ["convert_to_text"],
  mutationFn: (args: { base64Img: string; prompt: string }) =>
    execute(aiImageToText, args),
});
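
To tie it together, a component can read the selected file, strip the data-URL prefix, and hand the base64 payload to mutate. This is a rough sketch; the handler name and prompt text are made up for illustration:

import type { ChangeEvent } from "react";

// Wire this to an <input type="file" onChange={...} /> and pass in the
// `mutate` function returned by the useMutation hook above.
function handleFileChange(
  event: ChangeEvent<HTMLInputElement>,
  mutate: (args: { base64Img: string; prompt: string }) => void,
) {
  const file = event.target.files?.[0];
  if (!file) return;

  const reader = new FileReader();
  reader.onload = () => {
    // reader.result is a data URL ("data:image/jpeg;base64,..."); the API
    // only needs the base64 payload after the comma.
    const base64Img = String(reader.result).split(",")[1];
    mutate({ base64Img, prompt: "Extract all readable text" });
  };
  reader.readAsDataURL(file);
}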

Prompts:

This hackathon project best aligns with the following categories:

  • Honorable Mention

  • Best Multi-Model App

  • Grand Prize


Demo:

The deployed code is available on Vercel, and the source code can be found on GitHub.

Deploying to Hypermode

Once your functions are tested and ready, you can deploy them using GitHub Actions as described earlier. Ensure all your environment variables are correctly set up in the Hypermode dashboard.

After pushing your changes, Hypermode will automatically build and deploy your project. You can monitor the progress on the Hypermode dashboard. Once deployed, your GraphQL endpoint will be publicly accessible, and you can use it in production.

Potential Use Cases:

Let’s explore a few potential use cases for this AI Swiss Army Knife tool:

  1. AI-Powered Transcriptions and Summaries:
    Convert video or audio files into transcripts and summarize them using AI models. This is particularly useful for content creators or professionals attending lengthy meetings.

  2. Image-to-Text Extraction:
    Extract text from images using AI models, making it easier to process scanned documents, handwritten notes, or even text in different languages.

  3. Intelligent Code Completion:
    Integrate a Copilot-like feature that provides intelligent code suggestions, allowing developers to focus on solving complex problems rather than writing boilerplate code.

Conclusion

I hope this tool will assist my fellow developers in their day-to-day activities. However, it is not yet optimized for security or speed, and there is no rate limiting, so please use it with caution. The future of development lies in combining human intelligence with AI capabilities. Platforms like Hypermode simplify this process, allowing developers to focus on building innovative solutions without worrying about infrastructure.

By participating in this hackathon, I hope to inspire more developers to leverage AI in their projects and explore new possibilities. Whether it’s automating mundane tasks or creating entirely new experiences, AI is here to enhance our capabilities, not replace them.

With tools like Hypermode and Modus CLI, integrating AI into your projects has never been easier. So, why not take that extra step and see how AI can revolutionize your workflow? Happy coding!

