v1.0.0
active

free-crypto-news

io.github.nirholas/free-crypto-news

Free crypto news aggregator MCP - Bitcoin Ethereum DeFi Solana altcoins feeds

Documentation

🆓 Free Crypto News API

🚨 January 19, 2026 UPDATE: All free resources from Vercel have been used. I will need to consider rate limiting, a freemium API, or sponsorship to pay for the upkeep. I do not mind paying here and there, but unfortunately it will cut off my ability to keep developing. I apologize if you were using the API when it went down. You can host your own instance on Vercel, on Railway, locally, or in numerous other ways. If you need assistance deploying this repo, please let me know and I will be glad to help. I will update the README when I get the API going again. 🚨

Get real-time crypto news from 7 major sources with one API call.

🌍 Available in 18 languages: 中文 | 日本語 | 한국어 | Español | Français | Deutsch | Português | Русский | العربية | More...

curl https://free-crypto-news.vercel.app/api/news

That's it. It just works.


| | Free Crypto News | CryptoPanic | Others |
|---|---|---|---|
| Price | 🆓 Free forever | $29-299/mo | Paid |
| API Key | ❌ None needed | Required | Required |
| Rate Limit | Unlimited* | 100-1000/day | Limited |
| Sources | 7 | 1 | Varies |
| Self-host | ✅ One click | No | No |

Sources

We aggregate from 7 trusted outlets:

  • 🟠 CoinDesk: General crypto news
  • 🔵 The Block: Institutional & research
  • 🟢 Decrypt: Web3 & culture
  • 🟡 CoinTelegraph: Global crypto news
  • 🟤 Bitcoin Magazine: Bitcoin maximalist
  • 🟣 Blockworks: DeFi & institutions
  • 🔴 The Defiant: DeFi native

Endpoints

| Endpoint | Description |
|---|---|
| /api/news | Latest from all sources |
| /api/search?q=bitcoin | Search by keywords |
| /api/defi | DeFi-specific news |
| /api/bitcoin | Bitcoin-specific news |
| /api/breaking | Last 2 hours only |
| /api/trending | Trending topics with sentiment |
| /api/analyze | News with topic classification |
| /api/stats | Analytics & statistics |
| /api/sources | List all sources |
| /api/health | API & feed health status |
| /api/rss | Aggregated RSS feed |
| /api/atom | Aggregated Atom feed |
| /api/opml | OPML export for RSS readers |
| /api/docs | Interactive API documentation |
| /api/webhooks | Webhook registration |
| /api/archive | Historical news archive |
| /api/push | Web Push notifications |
| /api/origins | Find original news sources |
| /api/portfolio | Portfolio-based news + prices |

SDKs & Components

| Package | Description |
|---|---|
| React | <CryptoNews /> drop-in components |
| TypeScript | Full TypeScript SDK |
| Python | Zero-dependency Python client |
| JavaScript | Browser & Node.js SDK |
| Go | Go client library |
| PHP | PHP SDK |

Base URL: https://free-crypto-news.vercel.app

Failsafe Mirror: https://nirholas.github.io/free-crypto-news/

Query Parameters

| Parameter | Endpoints | Description |
|---|---|---|
| limit | All news endpoints | Max articles (1-50) |
| source | /api/news | Filter by source |
| from | /api/news | Start date (ISO 8601) |
| to | /api/news | End date (ISO 8601) |
| page | /api/news | Page number |
| per_page | /api/news | Items per page |
| hours | /api/trending | Time window (1-72) |
| topic | /api/analyze | Filter by topic |
| sentiment | /api/analyze | bullish/bearish/neutral |
| feed | /api/rss, /api/atom | all/defi/bitcoin |
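
The parameters above map directly onto a query string. Here is a minimal Python sketch of building an /api/news URL from them. `build_news_url` is a hypothetical helper (it is not part of the SDK), and the `source` value `coindesk` is an assumed source key.

```python
from urllib.parse import urlencode

BASE_URL = "https://free-crypto-news.vercel.app"  # Base URL listed in this README


def build_news_url(limit=10, source=None, page=None, per_page=None):
    """Build an /api/news URL from the documented query parameters.

    Hypothetical helper for illustration; not part of the official SDK.
    """
    # The table documents limit as 1-50, so reject values outside that range.
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    params = {"limit": limit, "source": source, "page": page, "per_page": per_page}
    # Drop unset parameters so the server-side defaults apply.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/news?{query}"


# "coindesk" is an assumed source key, used here only as an example value.
print(build_news_url(limit=5, source="coindesk"))
# -> https://free-crypto-news.vercel.app/api/news?limit=5&source=coindesk
```

Using `urlencode` rather than string concatenation keeps values such as dates or multi-word searches safely escaped.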

Response Format

{
  "articles": [
    {
      "title": "Bitcoin Hits New ATH",
      "link": "https://coindesk.com/...",
      "description": "Bitcoin surpassed...",
      "pubDate": "2025-01-02T12:00:00Z",
      "source": "CoinDesk",
      "timeAgo": "2h ago"
    }
  ],
  "totalCount": 150,
  "fetchedAt": "2025-01-02T14:30:00Z"
}
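
As a rough illustration of consuming this format, the sketch below parses a response body into typed records. The `Article` dataclass and `parse_news` function are hypothetical names chosen for this example; the field names come from the sample response above.

```python
import json
from dataclasses import dataclass


@dataclass
class Article:
    """One entry of the "articles" array (hypothetical type for this sketch)."""
    title: str
    link: str
    source: str
    pub_date: str
    time_ago: str = ""
    description: str = ""


def parse_news(payload: str) -> list:
    """Turn a JSON response body into a list of Article records."""
    data = json.loads(payload)
    return [
        Article(
            title=a["title"],
            link=a["link"],
            source=a["source"],
            pub_date=a["pubDate"],
            time_ago=a.get("timeAgo", ""),
            description=a.get("description", ""),
        )
        for a in data.get("articles", [])
    ]


# Sample body copied from the response format shown above.
SAMPLE = """{
  "articles": [
    {"title": "Bitcoin Hits New ATH", "link": "https://coindesk.com/...",
     "description": "Bitcoin surpassed...", "pubDate": "2025-01-02T12:00:00Z",
     "source": "CoinDesk", "timeAgo": "2h ago"}
  ],
  "totalCount": 150,
  "fetchedAt": "2025-01-02T14:30:00Z"
}"""

articles = parse_news(SAMPLE)
print(articles[0].title, "|", articles[0].source)
# -> Bitcoin Hits New ATH | CoinDesk
```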

Integration Examples

Pick your platform. Copy the code. Ship it.


🐍 Python

Zero dependencies. Just copy the file.

curl -O https://raw.githubusercontent.com/nirholas/free-crypto-news/main/sdk/python/crypto_news.py

from crypto_news import CryptoNews

news = CryptoNews()

# Get latest news
for article in news.get_latest(5):
    print(f"📰 {article['title']}")
    print(f"   {article['source']} • {article['timeAgo']}")
    print(f"   {article['link']}\n")

# Search for topics
eth_news = news.search("ethereum,etf", limit=5)

# DeFi news
defi = news.get_defi(5)

# Bitcoin news
btc = news.get_bitcoin(5)

# Breaking (last 2 hours)
breaking = news.get_breaking(5)

One-liner:

import urllib.request, json
news = json.loads(urllib.request.urlopen("https://free-crypto-news.vercel.app/api/news?limit=5").read())
print(news["articles"][0]["title"])

🟨 JavaScript / TypeScript

Works in Node.js and browsers.

TypeScript SDK (npm)

npm install @nicholasrq/crypto-news

import { CryptoNews } from '@nicholasrq/crypto-news';

const client = new CryptoNews();

// Fully typed responses
const articles = await client.getLatest(10);
const health = await client.getHealth();

Vanilla JavaScript

curl -O https://raw.githubusercontent.com/nirholas/free-crypto-news/main/sdk/javascript/crypto-news.js

import { CryptoNews } from './crypto-news.js';

const news = new CryptoNews();

// Get latest
const articles = await news.getLatest(5);
articles.forEach(a => console.log(`${a.title} - ${a.source}`));

// Search
const eth = await news.search("ethereum");

// DeFi / Bitcoin / Breaking
const defi = await news.getDefi(5);
const btc = await news.getBitcoin(5);
const breaking = await news.getBreaking(5);

One-liner:

const news = await fetch("https://free-crypto-news.vercel.app/api/news?limit=5").then(r => r.json());
console.log(news.articles[0].title);

🤖 ChatGPT (Custom GPT)

Build a crypto news GPT in 2 minutes.

  1. Go to chat.openai.com → Create GPT
  2. Click Configure → Actions → Create new action
  3. Paste this OpenAPI schema:

openapi: 3.1.0
info:
  title: Free Crypto News
  version: 1.0.0
servers:
  - url: https://free-crypto-news.vercel.app
paths:
  /api/news:
    get:
      operationId: getNews
      summary: Get latest crypto news
      parameters:
        - name: limit
          in: query
          schema:
            type: integer
            default: 10
  /api/search:
    get:
      operationId: searchNews
      summary: Search crypto news
      parameters:
        - name: q
          in: query
          required: true
          schema:
            type: string
  /api/defi:
    get:
      operationId: getDefiNews
      summary: Get DeFi news
  /api/bitcoin:
    get:
      operationId: getBitcoinNews
      summary: Get Bitcoin news
  /api/breaking:
    get:
      operationId: getBreakingNews
      summary: Get breaking news

  4. No authentication needed
  5. Save and test: "What's the latest crypto news?"

Full schema: chatgpt/openapi.yaml


🔮 Claude Desktop (MCP)

Add crypto news to Claude Desktop.

1. Clone & install:

git clone https://github.com/nirholas/free-crypto-news.git
cd free-crypto-news/mcp && npm install

2. Add to config

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (the macOS location):

{
  "mcpServers": {
    "crypto-news": {
      "command": "node",
      "args": ["/path/to/free-crypto-news/mcp/index.js"]
    }
  }
}

Restart Claude. Ask: "Get me the latest crypto news"

Or run locally:

cd mcp && npm install && node index.js

Full code: mcp/


🦜 LangChain

from langchain.tools import tool
import requests

BASE_URL = "https://free-crypto-news.vercel.app"

@tool
def get_crypto_news(limit: int = 5) -> str:
    """Get latest cryptocurrency news from 7 sources."""
    r = requests.get(f"{BASE_URL}/api/news", params={"limit": limit}, timeout=10)
    r.raise_for_status()  # Surface HTTP errors instead of failing on bad JSON
    return "\n".join(f"• {a['title']} ({a['source']})" for a in r.json()["articles"])

@tool
def search_crypto_news(query: str) -> str:
    """Search crypto news by keyword."""
    # Pass the query via params so requests URL-encodes spaces and symbols
    r = requests.get(f"{BASE_URL}/api/search", params={"q": query}, timeout=10)
    r.raise_for_status()
    return "\n".join(f"• {a['title']}" for a in r.json()["articles"])

# Use in your agent
tools = [get_crypto_news, search_crypto_news]

Full example: examples/langchain-tool.py


🎮 Discord Bot

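const { Client, EmbedBuilder, GatewayIntentBits } = require('discord.js');

// Reading message text requires the MessageContent intent (discord.js v14)
const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.MessageContent,
  ],
});

client.on('messageCreate', async (msg) => {
  if (msg.content === '!news') {
    const { articles } = await fetch('https://free-crypto-news.vercel.app/api/breaking?limit=5').then(r => r.json());

    const embed = new EmbedBuilder()
      .setTitle('🚨 Breaking Crypto News')
      .setColor(0x00ff00);

    articles.forEach(a => embed.addFields({
      name: a.source,
      value: `[${a.title}](${a.link})`
    }));

    await msg.channel.send({ embeds: [embed] });
  }
});

// DISCORD_TOKEN is an assumed environment variable name for your bot token
client.login(process.env.DISCORD_TOKEN);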

Full bot: examples/discord-bot.js


🤖 Telegram Bot

from telegram import Update
from telegram.ext import Application, CommandHandler
import aiohttp

async def news(update: Update, context):
    async with aiohttp.ClientSession() as session:
        async with session.get('https://free-crypto-news.vercel.app/api/news?limit=5') as r:
            data = await r.json()
    
    msg = "📰 *Latest Crypto News*\n\n"
    for a in data['articles']:
        msg += f"• [{a['title']}]({a['link']})\n"
    
    await update.message.reply_text(msg, parse_mode='Markdown')

app = Application.builder().token("YOUR_TOKEN").build()
app.add_handler(CommandHandler("news", news))
app.run_polling()

Full bot: examples/telegram-bot.py


🌐 HTML Widget

Embed on any website:

<script>
async function loadNews() {
  const { articles } = await fetch('https://free-crypto-news.vercel.app/api/news?limit=5').then(r => r.json());
  document.getElementById('news').innerHTML = articles.map(a => 
    `<div><a href="${a.link}">${a.title}</a> <small>${a.source}</small></div>`
  ).join('');
}
loadNews();
</script>
<div id="news">Loading...</div>

Full styled widget: widget/crypto-news-widget.html


🖥️ cURL / Terminal

# Latest news
curl -s https://free-crypto-news.vercel.app/api/news | jq '.articles[:3]'

# Search
curl -s "https://free-crypto-news.vercel.app/api/search?q=bitcoin,etf" | jq

# DeFi news
curl -s https://free-crypto-news.vercel.app/api/defi | jq

# Pretty print titles
curl -s https://free-crypto-news.vercel.app/api/news | jq -r '.articles[] | "📰 \(.title) (\(.source))"'

Self-Hosting

One-Click Deploy

Deploy with Vercel

Manual

git clone https://github.com/nirholas/free-crypto-news.git
cd free-crypto-news
pnpm install
pnpm dev

Open http://localhost:3000/api/news

Environment Variables

All environment variables are optional. The project works out of the box with zero configuration.

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | - | Enables i18n auto-translation (18 languages) |
| OPENAI_PROXY_URL | api.openai.com | Custom OpenAI endpoint |
| REDDIT_CLIENT_ID | - | Enables Reddit social signals |
| REDDIT_CLIENT_SECRET | - | Reddit OAuth secret |
| X_AUTH_TOKEN | - | X/Twitter signals via XActions |
| ARCHIVE_DIR | ./archive | Archive storage path |
| API_URL | Production Vercel | API endpoint for archive collection |

Feature Flags

| Variable | Default | Description |
|---|---|---|
| FEATURE_MARKET | true | Market data (CoinGecko, DeFiLlama) |
| FEATURE_ONCHAIN | true | On-chain events (BTC stats, DEX volumes) |
| FEATURE_SOCIAL | true | Social signals (Reddit sentiment) |
| FEATURE_PREDICTIONS | true | Prediction markets (Polymarket, Manifold) |
| FEATURE_CLUSTERING | true | Story clustering & deduplication |
| FEATURE_RELIABILITY | true | Source reliability tracking |
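
To make the "default true" behavior concrete, here is a small Python sketch of reading these flags from the environment. It is only an illustration: `read_feature_flags` is a hypothetical helper, and the set of strings treated as "off" is an assumption, since this README does not say how the project parses the values.

```python
import os

# Flag names taken from the table above.
FEATURE_FLAGS = (
    "FEATURE_MARKET",
    "FEATURE_ONCHAIN",
    "FEATURE_SOCIAL",
    "FEATURE_PREDICTIONS",
    "FEATURE_CLUSTERING",
    "FEATURE_RELIABILITY",
)

# Assumption: these spellings disable a flag; anything else leaves it on.
FALSY = {"false", "0", "no", "off"}


def read_feature_flags(env=None):
    """Return {flag_name: bool}, defaulting every unset flag to True."""
    env = os.environ if env is None else env
    return {
        name: env.get(name, "true").strip().lower() not in FALSY
        for name in FEATURE_FLAGS
    }


# Example: disable only the social signals collector.
flags = read_feature_flags({"FEATURE_SOCIAL": "false"})
print(flags["FEATURE_SOCIAL"], flags["FEATURE_MARKET"])
# -> False True
```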

GitHub Secrets (for Actions)

For full functionality, add these secrets to your repository:

OPENAI_API_KEY      # For i18n translations
REDDIT_CLIENT_ID    # For Reddit data (register at reddit.com/prefs/apps)
REDDIT_CLIENT_SECRET
X_AUTH_TOKEN        # For X/Twitter (from XActions login)

Tech Stack

  • Runtime: Next.js 14 Edge Functions
  • Hosting: Vercel free tier
  • Data: Direct RSS parsing (no database)
  • Cache: 5-minute edge cache
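
The "direct RSS parsing, no database" idea can be pictured with the sketch below. It is written in Python for illustration only (the project itself runs on Next.js), the feed XML is made-up sample data, and `parse_rss` is a hypothetical function that maps RSS items onto the article fields used in the response format.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed; real feeds would be downloaded from each outlet.
SAMPLE_RSS = (
    '<rss version="2.0"><channel><title>Example Feed</title>'
    "<item><title>Bitcoin Hits New ATH</title>"
    "<link>https://example.com/btc-ath</link>"
    "<pubDate>Thu, 02 Jan 2025 12:00:00 GMT</pubDate></item>"
    "<item><title>DeFi TVL Update</title>"
    "<link>https://example.com/defi-tvl</link>"
    "<pubDate>Thu, 02 Jan 2025 11:00:00 GMT</pubDate></item>"
    "</channel></rss>"
)


def parse_rss(xml_text, source):
    """Extract title, link and pubDate from every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(xml_text)
    return [
        {
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "pubDate": (item.findtext("pubDate") or "").strip(),
            "source": source,
        }
        for item in root.iter("item")
    ]


# Aggregation is then just concatenating the parsed items of each feed.
articles = parse_rss(SAMPLE_RSS, "Example Feed")
print(len(articles), articles[0]["title"])
# -> 2 Bitcoin Hits New ATH
```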

Contributing

PRs welcome! Ideas:

  • More news sources (Korean, Chinese, Japanese, Spanish)
  • Sentiment analysis ✅ Done
  • Topic classification ✅ Done
  • WebSocket real-time feed
  • Rust / Ruby SDKs
  • Mobile app (React Native)

New Features

📡 RSS Feed Output

Subscribe to the aggregated feed in any RSS reader:

https://free-crypto-news.vercel.app/api/rss
https://free-crypto-news.vercel.app/api/rss?feed=defi
https://free-crypto-news.vercel.app/api/rss?feed=bitcoin

🏥 Health Check

Monitor API and source health:

curl https://free-crypto-news.vercel.app/api/health | jq

Returns status of all 7 RSS sources with response times.

📖 Interactive Docs

Swagger UI documentation:

https://free-crypto-news.vercel.app/api/docs

🔔 Webhooks

Register for push notifications:

curl -X POST https://free-crypto-news.vercel.app/api/webhooks \
  -H "Content-Type: application/json" \
  -d '{"url": "https://your-server.com/webhook", "secret": "your-secret"}'

📊 Trending & Analytics

Trending Topics

curl "https://free-crypto-news.vercel.app/api/trending?hours=24"

Returns topics with sentiment (bullish/bearish/neutral) and mention counts.

News with Classification

# Get all analyzed news
curl https://free-crypto-news.vercel.app/api/analyze

# Filter by topic
curl "https://free-crypto-news.vercel.app/api/analyze?topic=DeFi"

# Filter by sentiment
curl "https://free-crypto-news.vercel.app/api/analyze?sentiment=bullish"

Statistics

curl https://free-crypto-news.vercel.app/api/stats

Returns articles per source, hourly distribution, and category breakdown.


📦 SDKs

| Language | Install |
|---|---|
| TypeScript | npm install @nicholasrq/crypto-news |
| Python | curl -O .../sdk/python/crypto_news.py |
| Go | go get github.com/nirholas/free-crypto-news/sdk/go |
| PHP | curl -O .../sdk/php/CryptoNews.php |
| JavaScript | curl -O .../sdk/javascript/crypto-news.js |

See /sdk for documentation.


🤖 Integrations


📚 Historical Archive

Query historical news data stored in GitHub:

# Get archive statistics
curl "https://free-crypto-news.vercel.app/api/archive?stats=true"

# Query by date range
curl "https://free-crypto-news.vercel.app/api/archive?start_date=2025-01-01&end_date=2025-01-07"

# Search historical articles
curl "https://free-crypto-news.vercel.app/api/archive?q=bitcoin&limit=50"

# Get archive index
curl "https://free-crypto-news.vercel.app/api/archive?index=true"

Archive is automatically updated every 6 hours via GitHub Actions.


🛡️ Failsafe Mirror

If the main Vercel deployment is down, use the GitHub Pages backup:

Failsafe URL

https://nirholas.github.io/free-crypto-news/

Static JSON Endpoints

| Endpoint | Description |
|---|---|
| /cache/latest.json | Latest cached news (hourly) |
| /cache/bitcoin.json | Bitcoin news cache |
| /cache/defi.json | DeFi news cache |
| /cache/trending.json | Trending topics cache |
| /cache/sources.json | Source list |
| /archive/index.json | Historical archive index |

Status Page

https://nirholas.github.io/free-crypto-news/status.html

Real-time monitoring of all API endpoints with auto-refresh.

How It Works

  1. GitHub Actions runs every hour to cache data from main API
  2. GitHub Pages serves the static JSON files
  3. Failsafe page auto-detects if main API is down and switches to cache
  4. Archive workflow runs every 6 hours to store historical data

Client-Side Failsafe Pattern

const MAIN_API = 'https://free-crypto-news.vercel.app';
const FAILSAFE = 'https://nirholas.github.io/free-crypto-news';

async function getNews() {
  // Abort the main request if it takes longer than 5 seconds
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 5000);

  try {
    // Try main API first
    const res = await fetch(`${MAIN_API}/api/news`, { signal: controller.signal });
    if (!res.ok) throw new Error('API error');
    // Await here so a failed body read also triggers the fallback
    return await res.json();
  } catch {
    // Fallback to GitHub Pages cache
    const res = await fetch(`${FAILSAFE}/cache/latest.json`);
    return res.json();
  } finally {
    clearTimeout(timer);
  }
}
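
The same failsafe pattern can be sketched in Python with only the standard library. This is an illustration rather than SDK code: `fetch_json` and `get_news` are hypothetical names, and the `fetch` parameter exists so the fallback path can be exercised without network access.

```python
import json
import urllib.request

MAIN_API = "https://free-crypto-news.vercel.app"
FAILSAFE = "https://nirholas.github.io/free-crypto-news"


def fetch_json(url, timeout=5):
    """GET a URL and decode the JSON body (5s timeout, as in the pattern above)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def get_news(fetch=fetch_json):
    """Try the main API first, then fall back to the GitHub Pages cache."""
    try:
        return fetch(f"{MAIN_API}/api/news")
    except (OSError, ValueError):
        # URLError, HTTPError and timeouts are OSError subclasses;
        # a malformed JSON body raises ValueError.
        return fetch(f"{FAILSAFE}/cache/latest.json")


def offline_stub(url):
    """Stand-in fetcher that simulates an outage of the main deployment."""
    if url.startswith(MAIN_API):
        raise OSError("simulated outage")
    return {"articles": [], "served_from": url}


print(get_news(fetch=offline_stub)["served_from"])
# -> https://nirholas.github.io/free-crypto-news/cache/latest.json
```

Calling `get_news()` with no arguments uses the real network fetcher.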

🔍 Original Source Finder

Track where news originated before being picked up by aggregators:

# Find original sources for recent news
curl "https://free-crypto-news.vercel.app/api/origins?limit=20"

# Filter by source type
curl "https://free-crypto-news.vercel.app/api/origins?source_type=government"

# Search specific topic
curl "https://free-crypto-news.vercel.app/api/origins?q=SEC"

Source types: official, press-release, social, blog, government

Identifies sources like SEC, Federal Reserve, Binance, Coinbase, Vitalik Buterin, X/Twitter, etc.


🔔 Web Push Notifications

Subscribe to real-time push notifications:

// Get VAPID public key
const { publicKey } = await fetch('https://free-crypto-news.vercel.app/api/push').then(r => r.json());

// Register subscription
await fetch('https://free-crypto-news.vercel.app/api/push', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    subscription: pushSubscription,
    topics: ['bitcoin', 'breaking', 'defi']
  })
});

🎨 Embeddable Widgets

News Ticker

<div id="crypto-ticker" class="crypto-ticker" data-auto-init>
  <div class="crypto-ticker-label">📰 CRYPTO</div>
  <div class="crypto-ticker-track"></div>
</div>
<script src="https://nirholas.github.io/free-crypto-news/widget/ticker.js"></script>

News Carousel

<div id="crypto-carousel" class="crypto-carousel" data-auto-init>
  <div class="crypto-carousel-viewport">
    <div class="crypto-carousel-track"></div>
  </div>
</div>
<script src="https://nirholas.github.io/free-crypto-news/widget/carousel.js"></script>

See full widget examples in /widget


🗄️ Archive v2: The Definitive Crypto News Record

We're building the most comprehensive open historical archive of crypto news. Every headline. Every hour. Forever.

What's in v2

| Feature | Description |
|---|---|
| Hourly collection | Every hour, not every 6 hours |
| Append-only | Never overwrite - every unique article preserved |
| Deduplication | Content-addressed IDs prevent duplicates |
| Entity extraction | Auto-extracted tickers ($BTC, $ETH, etc.) |
| Named entities | People, companies, protocols identified |
| Sentiment scoring | Every headline scored positive/negative/neutral |
| Market context | BTC/ETH prices + Fear & Greed at capture time |
| Content hashing | SHA256 for integrity verification |
| Hourly snapshots | What was trending each hour |
| Indexes | Fast lookups by source, ticker, date |
| JSONL format | Streamable, append-friendly, grep-able |
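
Content-addressed deduplication can be sketched as follows. This is a guess at the general technique, not the archive's actual code: which fields are hashed, the link normalization, and the 16-character ID length are all assumptions made for this example.

```python
import hashlib


def normalize_link(link):
    """Assumed normalization: trim whitespace and any trailing slash."""
    return link.strip().rstrip("/")


def article_id(link, title):
    """Derive a stable ID from article content using SHA-256.

    Hashing link + title and keeping 16 hex characters are assumptions.
    """
    payload = f"{normalize_link(link)}\n{title.strip()}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def dedupe(articles):
    """Keep the first copy of each article, keyed by its content-derived ID."""
    seen = {}
    for article in articles:
        seen.setdefault(article_id(article["link"], article["title"]), article)
    return list(seen.values())


# The same story fetched twice, once with a trailing slash in the link.
fetched = [
    {"link": "https://example.com/post/", "title": "BlackRock adds BTC"},
    {"link": "https://example.com/post", "title": "BlackRock adds BTC"},
]
print(len(dedupe(fetched)))
# -> 1
```

Because the ID depends only on content, re-running a collection job over the same feeds appends nothing new.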

V2 API Endpoints

# Get enriched articles with all metadata
curl "https://free-crypto-news.vercel.app/api/archive/v2?limit=20"

# Filter by ticker
curl "https://free-crypto-news.vercel.app/api/archive/v2?ticker=BTC"

# Filter by sentiment
curl "https://free-crypto-news.vercel.app/api/archive/v2?sentiment=positive"

# Get archive statistics
curl "https://free-crypto-news.vercel.app/api/archive/v2?stats=true"

# Get trending tickers (last 24h)
curl "https://free-crypto-news.vercel.app/api/archive/v2?trending=true"

# Get market history for a month
curl "https://free-crypto-news.vercel.app/api/archive/v2?market=2026-01"

Archive Directory Structure

archive/
  v2/
    articles/           # JSONL files, one per month
      2026-01.jsonl     # All articles from January 2026
    snapshots/          # Hourly trending state
      2026/01/11/
        00.json         # What was trending at midnight
        01.json         # What was trending at 1am
        ...
    market/             # Price/sentiment history
      2026-01.jsonl     # Market data for January 2026
    index/              # Fast lookups
      by-source.json    # Article IDs grouped by source
      by-ticker.json    # Article IDs grouped by ticker
      by-date.json      # Article IDs grouped by date
    meta/
      schema.json       # Schema version and definition
      stats.json        # Running statistics

Enriched Article Schema

{
  "id": "a1b2c3d4e5f6g7h8",
  "schema_version": "2.0.0",
  "title": "BlackRock adds $900M BTC...",
  "link": "https://...",
  "canonical_link": "https://... (normalized)",
  "description": "...",
  "source": "CoinTelegraph",
  "source_key": "cointelegraph",
  "category": "bitcoin",
  "pub_date": "2026-01-08T18:05:00.000Z",
  "first_seen": "2026-01-08T18:10:00.000Z",
  "last_seen": "2026-01-08T23:05:00.000Z",
  "fetch_count": 5,
  "tickers": ["BTC"],
  "entities": {
    "people": ["Larry Fink"],
    "companies": ["BlackRock"],
    "protocols": ["Bitcoin"]
  },
  "tags": ["institutional", "price"],
  "sentiment": {
    "score": 0.65,
    "label": "positive",
    "confidence": 0.85
  },
  "market_context": {
    "btc_price": 94500,
    "eth_price": 3200,
    "fear_greed_index": 65
  },
  "content_hash": "h8g7f6e5d4c3b2a1",
  "meta": {
    "word_count": 23,
    "has_numbers": true,
    "is_breaking": false,
    "is_opinion": false
  }
}
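
Because the archive is stored as JSONL, records with this schema can be streamed and filtered one line at a time. The Python sketch below assumes the field names shown above; `iter_articles` and `filter_articles` are hypothetical helpers, and the two sample lines are made-up data.

```python
import json


def iter_articles(lines):
    """Yield one article dict per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def filter_articles(articles, ticker=None, sentiment=None):
    """Filter by ticker symbol and/or sentiment label (schema fields above)."""
    for article in articles:
        if ticker and ticker not in article.get("tickers", []):
            continue
        if sentiment and article.get("sentiment", {}).get("label") != sentiment:
            continue
        yield article


# Made-up lines in the shape of a monthly JSONL file; with a real file,
# pass the open file object instead of this list.
SAMPLE_JSONL = [
    '{"title": "BlackRock adds $900M BTC...", "tickers": ["BTC"], "sentiment": {"label": "positive"}}',
    '{"title": "Example ETH headline", "tickers": ["ETH"], "sentiment": {"label": "negative"}}',
    "",
]

for match in filter_articles(iter_articles(SAMPLE_JSONL), ticker="BTC", sentiment="positive"):
    print(match["title"])
# -> BlackRock adds $900M BTC...
```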

🚀 Roadmap

Building the definitive open crypto intelligence platform.

✅ Complete

  • Real-time aggregation from 7 sources
  • REST API with multiple endpoints
  • RSS/Atom feeds
  • SDKs (Python, JavaScript, TypeScript, Go, PHP, React)
  • MCP server for AI assistants
  • Embeddable widgets
  • Archive v2 with enrichment
  • Hourly archive collection workflow
  • Entity/ticker extraction
  • Sentiment analysis
  • Market context capture (CoinGecko + DeFiLlama)
  • Story clustering engine
  • Source reliability tracking
  • On-chain event tracking (Bitcoin, DeFi TVL, DEX volumes, bridges)
  • X/Twitter social signals via XActions (no API key needed!)
  • Prediction market tracking (Polymarket, Manifold)
  • AI training data exporter
  • Analytics engine with daily/weekly digests

🔨 In Progress

  • Full test of enhanced collection pipeline
  • LunarCrush / Santiment social metrics

📋 Short-Term (Q1 2026)

Data Enrichment

  • Full article extraction (where legally permissible)
  • AI-powered summarization (1-sentence, 1-paragraph)
  • Advanced entity extraction with AI
  • Event classification (funding, hack, regulation, etc.)
  • Claim extraction (factual claims as structured data)
  • Relationship extraction (who did what to whom)

Multi-Lingual

  • i18n workflow with 18 languages (auto-translation via OpenAI)
  • Translated README and docs
  • Korean sources (Crypto primers, etc.)
  • Chinese sources (8btc, etc.)
  • Japanese sources
  • Spanish sources

Real-Time Features

  • WebSocket streaming
  • Faster webhook delivery
  • Real-time alert conditions

📋 Medium-Term (Q2-Q3 2026)

Intelligence Layer (Partial - In Progress)

  • Story clustering (group related articles) ✅
  • Headline mutation tracking (detect changes)
  • Source first-mover tracking (who breaks news) ✅
  • Coordinated narrative detection ✅
  • Prediction tracking & accuracy scoring
  • Anomaly detection (unusual coverage patterns) ✅

Social Intelligence (Partial - In Progress)

  • X/Twitter integration via XActions (browser automation - FREE!) ✅
  • Discord public channel monitoring
  • Telegram channel aggregation
  • Influencer reliability scoring

On-Chain Correlation (Partial - In Progress)

  • Link news to on-chain events
  • Whale movement correlation (structure ready) ✅
  • DEX volume correlation ✅
  • Bridge volume tracking ✅
  • Coverage gap analysis (what's NOT being covered)

AI Products

  • The Oracle: Natural language queries over all data
  • The Brief: Personalized AI-generated digests
  • The Debate: Multi-perspective synthesis
  • The Counter: Fact-checking as a service

📋 Long-Term (2027+)

Research Infrastructure

  • Causal inference engine
  • Backtesting infrastructure
  • Hypothesis testing platform
  • Academic access program

Trust & Verification

  • Content-addressed storage (IPFS-style)
  • Periodic merkle roots anchored to blockchain
  • Deep fake / AI content detection
  • Source network forensics

Formats & Access (Partial - In Progress)

  • Parquet exports for analytics
  • SQLite monthly exports
  • Embedding vectors for semantic search (export ready) ✅
  • LLM fine-tuning ready datasets ✅

The Meta-Play

  • Industry-standard reference for disputes
  • Academic citation network
  • AI training data licensing
  • Prediction registry (timestamped predictions with outcomes)

📂 Archive v2 Data Structure

The enhanced archive system captures comprehensive crypto intelligence:

archive/v2/
├── articles/              # JSONL, append-only articles
│   └── 2026-01.jsonl     # ~50 new articles per hour
├── market/               # Full market snapshots
│   └── 2026-01.jsonl     # CoinGecko + DeFiLlama data
├── onchain/              # On-chain events
│   └── 2026-01.jsonl     # BTC stats, DEX volumes, bridges
├── social/               # Social signals
│   └── 2026-01.jsonl     # Reddit sentiment, trending
├── predictions/          # Prediction markets
│   └── 2026-01.jsonl     # Polymarket + Manifold odds
├── snapshots/            # Hourly trending snapshots
│   └── 2026/01/11/
│       └── 08.json       # Complete state at 08:00 UTC
├── analytics/            # Generated insights
│   ├── digest-2026-01-11.json
│   ├── narrative-momentum.json
│   └── coverage-patterns.json
├── exports/training/     # AI-ready exports
│   ├── instruction-tuning.jsonl
│   ├── qa-pairs.jsonl
│   ├── sentiment-dataset.jsonl
│   ├── embeddings-data.jsonl
│   └── ner-training.jsonl
├── index/                # Fast lookups
│   ├── by-source.json
│   ├── by-ticker.json
│   └── by-date.json
└── meta/
    ├── schema.json
    ├── stats.json
    └── source-stats.json # Reliability scores

Per-Article Data

Each article is enriched with:

{
  "id": "sha256:abc123...",
  "schema_version": "2.0.0",
  "title": "Bitcoin Surges Past $100K",
  "link": "https://...",
  "description": "...",
  "source": "CoinDesk",
  "source_key": "coindesk",
  "pub_date": "2026-01-11T10:00:00Z",
  "first_seen": "2026-01-11T10:05:00Z",
  "last_seen": "2026-01-11T18:05:00Z",
  "fetch_count": 8,
  "tickers": ["BTC", "ETH"],
  "categories": ["market", "bitcoin"],
  "sentiment": "bullish",
  "market_context": {
    "btc_price": 100500,
    "eth_price": 4200,
    "fear_greed": 75,
    "btc_dominance": 52.3
  }
}

Hourly Snapshot Data

Each hour captures:

  • Articles: Count, sentiment breakdown, top tickers, source distribution
  • Market: Top 100 coins, DeFi TVL, yields, stablecoins, trending
  • On-Chain: BTC network stats, DEX volumes, bridge activity
  • Social: Reddit sentiment, active users, trending topics
  • Predictions: Polymarket/Manifold crypto prediction odds
  • Clustering: Story clusters, first-movers, coordinated releases
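
The "Articles" part of a snapshot (count, sentiment breakdown, top tickers, source distribution) can be approximated with a few counters. The sketch below assumes the per-article fields shown earlier in this section; `summarize_hour`, its output keys, and the sample records are hypothetical.

```python
from collections import Counter


def summarize_hour(articles, top_n=3):
    """Aggregate one hour of articles into snapshot-style statistics."""
    ticker_counts = Counter(t for a in articles for t in a.get("tickers", []))
    return {
        "count": len(articles),
        "sentiment": dict(Counter(a.get("sentiment", "neutral") for a in articles)),
        "top_tickers": [ticker for ticker, _ in ticker_counts.most_common(top_n)],
        "sources": dict(Counter(a["source"] for a in articles)),
    }


# Made-up records using the per-article field names from this section.
hour = [
    {"source": "CoinDesk", "tickers": ["BTC", "ETH"], "sentiment": "bullish"},
    {"source": "Decrypt", "tickers": ["BTC"], "sentiment": "bearish"},
    {"source": "CoinDesk", "tickers": ["BTC", "ETH", "SOL"], "sentiment": "bullish"},
]

snapshot = summarize_hour(hour)
print(snapshot["count"], snapshot["top_tickers"])
# -> 3 ['BTC', 'ETH', 'SOL']
```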

Why This Matters

Time is our moat.

If we capture complete data now with proper structure, in 2 years we'll have something nobody can recreate. The compound value:

  • Year 1: Interesting dataset
  • Year 3: Valuable for research
  • Year 5: Irreplaceable historical record
  • Year 10: The definitive source, cited in papers, used by institutions

Every day we delay proper archiving is data lost forever.


License

MIT © 2025 nich


Stop paying for crypto news APIs.
Made with 💜 for the community