Repository avatar
Search & Data Extraction
v1.0.0
active

criterion

io.github.BalajSaleem/criterion

Semantic search across 6,236 Quran verses and 12,416 authentic Hadiths for Islamic guidance.

Documentation

Criterion - Islamic Knowledge Assistant

An AI-powered Da'i (invitor to Islam) bringing authentic Islamic guidance to seekers worldwide.

Built on the Quran and authentic Hadith. Free forever. For the sake of Allah.

Mission · Features · Tech Stack · Getting Started · MCP Server · Documentation


Mission

Criterion exists to bring authentic Islamic knowledge to anyone seeking truth, using modern technology to make divine guidance accessible to all of humanity — freely, forever, for the sake of Allah alone.

Our Four Pillars

  1. Truth & Authenticity — Every response is grounded in verified sources (Quran and Sahih Hadith). We never fabricate or hallucinate.
  2. Fundamentals & Simplicity — We focus on core Islamic teachings that unite. We avoid sectarian debates and controversial topics.
  3. For the Sake of Allah — Criterion will always be free, with no monetization or organizational promotion. This is Sadaqah Jariyah.
  4. State of the Art — We use cutting-edge AI to deliver Islamic guidance effectively to the masses.

👉 Read the full mission and vision in MISSION.md

Key Differentiators

Mission-Aligned:

  • Free Forever — No paywalls, no ads, no monetization. Built fi sabilillah (for Allah's sake)
  • 📚 Fundamentals-Focused — Avoids sectarian debates, focuses on universally accepted Islamic teachings
  • 🛡️ Trust-First — Grade-filtered authentic Hadith (defaults to Sahih), verified sources only
  • 🤝 Seeker-Oriented — Designed for curious minds, new Muslims, and students of knowledge
  • 🕌 Da'i Personality — Compassionate, knowledgeable, humble guidance

Technical Excellence:

  • 🎯 Semantic Search — Natural language queries return relevant verses from 6,236 Quran verses + 12,416 Hadith narrations
  • 📖 Contextual Retrieval — Top results include ±2 surrounding verses/narrations for proper context
  • 🌐 Multilingual — Read in English (fast) + Slovak (expandable to 10+ languages)
  • 🔗 Accurate Citations — All responses include source references with hyperlinks (Quran.com, Sunnah.com)
  • Fast — <150ms query response time

Features

What Criterion Does

Semantic Quran Search — Ask natural language questions, get relevant verses
Semantic Hadith Search — Search authentic Hadith with grade & collection filtering
Contextual Understanding — Top results include surrounding context for proper meaning
Accurate Citations — Every response cites real sources with hyperlinks
Multilingual Reading — English (fast) + Slovak (single JOIN <200ms)
Shareable URLs/quran/search?q=patience, /hadith/search?q=charity, and /quran/2/255 with metadata
Real-time Streaming — Progressive response generation with token-by-token delivery
Tool-Based RAG — LLM autonomously decides when to retrieve from Quran/Hadith

Technical Stack

How It Works

The RAG Pipeline

User Question
    ↓
XAI Grok 4 LLM (decides which tools to use)
    ↓
Tool Selection:
  - queryQuran → 6,236 verses (top 7 for chat, top 20 for search)
  - queryHadith → 12,416 hadiths (top 3 for chat, top 15 for search, with grade filtering)
    ↓
Vector Search (768-dim Gemini embeddings)
    ↓
Context Enhancement (top 3 get ±2 surrounding verses)
    ↓
LLM Generates Response with Citations
    ↓
Real-time Stream to User (Server-Sent Events)

Data

  • 6,236 Quran verses from all 114 Surahs

    • Arabic text (Tanzil Quran)
    • English translation (master)
    • Slovak translation (expandable)
    • 768-dimensional embeddings (Gemini text-embedding-004)
  • 12,416 Hadith narrations from 4 collections

    • Sahih Bukhari (7,558)
    • Sahih Muslim (2,920)
    • 40 Hadith Nawawi (42)
    • Riyad as-Salihin (1,896)
    • Grade filtering (Sahih, Hasan, Da'if)
    • 768-dimensional embeddings

Performance

  • Quran search: <150ms (English), <200ms (translated)
  • Hadith search: <150ms
  • Vector search: Powered by HNSW indexes
  • Streaming: Real-time token-by-token delivery

Getting Started

Prerequisites

  • Node.js 18+ and pnpm
  • PostgreSQL database (recommend Neon)
  • API Keys:
    • XAI API Key (for Grok LLM)
    • Google AI Studio API Key (for embeddings)

Installation

  1. Clone the repository
git clone <repo-url>
cd criterion
  1. Install dependencies
pnpm install
  1. Set up environment variables

Create a .env.local file:

# Database
POSTGRES_URL=postgresql://...

# AI APIs
XAI_API_KEY=xai-...
GOOGLE_GENERATIVE_AI_API_KEY=...

# Authentication (optional)
AUTH_SECRET=...
  1. Enable pgvector extension
pnpm db:enable-pgvector
  1. Run database migrations
pnpm db:migrate
  1. Ingest Quran data (generates embeddings for 6,236 verses)
pnpm ingest:quran

This will take 10-15 minutes to complete.

  1. Test the Quran search
pnpm test:quran
  1. Start the development server
pnpm dev

Your app should now be running on localhost:3000.

Available Commands

Development

pnpm dev          # Start dev server
pnpm build        # Build for production
pnpm start        # Start production server

Database

pnpm db:generate  # Generate Drizzle schema
pnpm db:migrate   # Run migrations
pnpm db:studio    # Open Drizzle Studio (GUI)

Data Ingestion & Testing

# Quran
pnpm clear:quran         # Clear all Quran data
pnpm ingest:quran        # Ingest Quran verses and generate embeddings
pnpm ingest:quran:slovak # Ingest Slovak translation
pnpm test:quran          # Test Quran search functionality

# Hadith
pnpm clear:hadith  # Clear all Hadith data
pnpm ingest:hadith # Ingest Hadith and generate embeddings

MCP Server

Criterion exposes its semantic search capabilities through the Model Context Protocol (MCP), allowing AI assistants like Claude Desktop and Cursor to search Quran and Hadith directly.

Quick Setup:

{
  "mcpServers": {
    "criterion": {
      "url": "https://criterion.life/api/mcp"
    }
  }
}

Available Tools:

  • search_quran — Search 6,236 Quran verses
  • search_hadith — Search 12,416 authentic Hadiths
  • get_verse — Retrieve specific verse by reference (e.g., "2:255")

👉 Read full MCP documentation in MCP.md

Project Structure

criterion/
├── app/
│   ├── (auth)/          # Authentication routes
│   ├── (chat)/          # Chat interface and API
│   │   └── api/chat/    # Main chat endpoint
│   ├── search/          # Quran search page
│   │   └── api/         # Quran search API
│   ├── hadith/
│   │   └── search/      # Hadith search page and API
│   └── quran/           # Quran reading pages
├── lib/
│   ├── ai/
│   │   ├── embeddings.ts     # Core RAG logic
│   │   ├── prompts.ts        # Da'i system prompts
│   │   └── tools/
│   │       ├── query-quran.ts   # Quran search tool
│   │       └── query-hadith.ts  # Hadith search tool
│   └── db/
│       ├── schema.ts         # Database schema
│       └── migrations/       # SQL migrations
├── components/
│   ├── chat.tsx              # Main chat UI
│   ├── quran-verses.tsx      # Quran display component
│   ├── hadith-narrations.tsx # Hadith carousel
│   └── hadith/
│       └── hadith-card.tsx   # Reusable hadith card
├── scripts/
│   ├── ingest-quran.ts       # Quran data ingestion
│   ├── ingest-hadith.ts      # Hadith data ingestion
│   └── test-*.ts             # Test scripts
└── data/
    ├── quran*.txt            # Quran translations
    └── *-full.json           # Hadith collections

Documentation

Understanding Criterion

  • MISSION.md — Our vision, values, and deeper purpose. Read this first to understand why we build Criterion.
  • CRITERION_DETAILED.md — Comprehensive technical documentation including architecture, implementation history, and performance metrics.
  • CRITERION.md — Quick reference guide for setup and key concepts.

Key Sections

DocumentPurpose
MISSION.mdVision, values, pillars, and long-term goals
CRITERION_DETAILED.mdTechnical architecture, database schema, components, and best practices
CRITERION.mdQuick start, commands, and core concepts
README.mdGetting started, features, and project overview

Architecture Overview

components/
├── Chat UI (QuranVerses, HadithNarrations, MessageActions)
├── Search Pages (Quran and Hadith semantic search with filters)
├── Hadith Components (reusable HadithCard for search and chat)
├── Quran Pages (shared components for context, language selection)
└── UI Components (buttons, inputs, etc.)

lib/
├── ai/
│   ├── embeddings.ts (vector search logic)
│   ├── prompts.ts (Da'i system prompts)
│   └── tools/ (queryQuran, queryHadith, requestSuggestions)
├── db/
│   ├── schema.ts (Drizzle ORM definitions)
│   └── queries.ts (database functions)
└── monitoring/ (performance tracking)

app/
├── (chat)/api/chat (main chat endpoint)
├── quran/search/ (Quran search page and API)
├── hadith/search/ (Hadith search page and API)
├── quran/ (Quran reading pages)
└── (auth)/ (authentication)

Data Attribution

  • Quran Text: Tanzil.net — Creative Commons Attribution 3.0
  • Quran Translations: Multiple sources with proper attribution
  • Hadith Collections: Sunnah.com, IslamicNetwork.com
  • Embeddings: Google Gemini text-embedding-004

Our Commitment

Criterion is built with these commitments:

  • Never monetize Islamic knowledge
  • Always cite sources with proper references
  • Never fabricate verses or hadiths
  • Focus on fundamentals — avoid sectarian debates
  • Build for the community — this belongs to all Muslims and benefits all humanity
  • Stay at the forefront — leverage state-of-the-art technology

Contributing

We welcome contributions from developers, scholars, and community members who share our mission. Please see CONTRIBUTING.md for guidelines.

License

  • Quran Text: Creative Commons Attribution 3.0 (Tanzil.net)
  • Hadith Data: From verified Islamic sources with proper attribution
  • Code: See LICENSE file for details

"Invite to the way of your Lord with wisdom and good instruction, and argue with them in a way that is best." — Quran 16:125

May Allah accept this work and make it a means of guidance for seekers everywhere. Ameen.