Community judging

Projects & companies

Every team building at BuilderShip — what they're building, plus their demo, website, and socials. Sign in to rate each project 1–5 stars — everyone gets to judge.

724 projects

Yaroslav Volovich

Own your slop! Today agents write all your code, make your presentations, write your emails, and even send your Slack messages. The problem? YOU have to own these. YOU have to understand what the agents did! This is a universal problem, especially for larger organizations. It doesn't matter HOW the work was done; what DOES matter is that YOU are responsible for it, that you OWN it. Our app takes an artifact (PRs, slides, etc.), pretty much anything accessible via Composio (and thus Tavily), and forces you to explain it to the agent. To engage all your senses and make the computer<->human brain link as efficient as possible, you (the user) have to WRITE ON PAPER! The agent (and the app) is configured to use Apple's ARKit to see your drawings correctly. The agent guides and corrects you during the session. It is specifically prompted to keep an up-to-date model of your understanding and of the gaps you need to close. This makes it the most efficient way to OWN your agent's work.


Loopy

Top 5 · Score 9.9
Arjun Shah

Loopy is the fully autonomous product-building agent that builds, tests, and just keeps going until it thinks it's done. Give it a starting point, and it researches the market, plans the product, builds it, tests it, tracks progress in Linear, fixes issues, iterates, and keeps going until it decides the product is ready. It sends the user constant emails with progress updates, and users can even provide their input to Loopy, and it will incorporate that in realtime. Loopy uses Nebius to power the main agent, Tavily and Firecrawl for deep research, and Composio as the execution layer across tools. It manages the full project in Linear, creates and updates GitHub repos, and maintains four core Markdown files that act as its operating system: goals, specs, issues, and progress. The hardest problem was memory. A real autonomous build loop can run for 6+ hours, but stuffing the entire session into context is expensive, slow, and brittle. To solve this, we integrated Mem0 for long-term memory across Nebius and the coding agent. We also integrated Supercompress, an open-source compression tool that reduced token usage by ~65% while retaining 98.7% of critical information. The result: Loopy can stay coherent across long build sessions without drowning in context. The magic is that the human does almost nothing. Start a loop, go to bed, and wake up to a tested, validated, deployed product that is ready to market. For founders, this means product development no longer has to consume the whole day. They can spend more time on distribution, customers, and strategy while Loopy handles the build loop. Loopy is fully open source, runs locally on the user’s device, and plugs into 40+ CLI tools including Claude Code, Codex, and Cursor. Because it runs locally, users keep control of their data and workflow.

April Zheng

Watchstander: the cross-vendor reliability layer for AI data centers. It watches a GPU fleet, predicts a hardware failure ~18 hours before it happens, opens a pre-emptive incident, and, only after a human approves, drains the node and files the vendor RMA. sense → predict → act. At frontier scale a single unplanned GPU interruption reportedly costs ~$2.5M, so an 18-hour head start is the whole game.
Three tracks:
🔭 1. Sense: normalizes NVIDIA (DCGM) + AMD (ROCm/Redfish) telemetry onto one schema.
🔮 2. Predict: a model trained on real GWDG Xid-79 failure data flags a degrading node 18.17 h early (honest metrics: 0.18/0.56).
🤝 3. Act + Trust: triage → cited diagnosis → fleet-pattern detection → cross-vendor RMA, every destructive action human-gated.
How we integrated the sponsor stack
Each does real work in the loop (not one token call); offline-deterministic by default, live with keys. ✅ = exercised live (evidence in docs/evidence/).
🧠 Nebius Token Factory ✅: model-routed inference (Llama-3.3-70B triage, Qwen3-235B diagnosis + RMA summary, plus Qwen3-Embedding-8B re-ranking evidence). Timeout/retry/graceful-JSON hardened.
🔎 Tavily ✅: multi-query fan-out per signature, scored + de-duped, include_answer synthesis threaded into the Nebius diagnosis.
🛠️ Composio ✅: the real actions (GitHub ticket + Slack on-call page + cross-vendor RMA). make verify-live posted a real issue (#21) + Slack message.
🧩 mem0 ✅: causal long-term memory; recall feeds the diagnosis and a confirmed recurrence overrides the drain confidence floor.
⚡ Nebius AI Cloud (Serverless AI Builders Challenge): predictor trained/backtested on AI Cloud GPU; inference repointable via NEBIUS_BASE_URL.
🤖 OpenClaw: the agent runtime (handle_alert + human-approval gate; make openclaw / POST /alert / openclaw.toml).
Teammates: April Zheng; Dave Petrovikj; Alice Xu

Alison Cossette

Sovereign, permission-aware voice agents for assistive robotics. A young woman with cerebral palsy runs her own life through an agent that understands her — a model fine-tuned on her own voice (Nebius H100), acting on her behalf only inside scoped, revocable, owner-held consent (Bolo) and acting on real apps (Composio). Plus Plumbline (github.com/alisoncossette/plumbline) — an open-source intent-boundary analyzer that hardens any agent acting on someone's behalf (it found & we fixed real bugs in ours).


Taskbash

Top 10 · Score 9.8
Subash Rajaseelan

taskbash is the AI Chief of Staff for operators. It pulls every commitment you make across Gmail, Granola, calendar, and Linear into one ranked daily list, then autocompletes the work that doesn't need your brain: drafting replies in your voice, prepping you for every meeting, running follow-ups while you're heads-down. The list stays clean because the agent learns from every task you dismiss. Operators dogfooding it ship 20% more per week without staying later, and the autocomplete sharpens every week.

Ayush Paul

We hijacked low-level iOS integration test automation and accessibility APIs to completely take over your phone and control it via an agent. (We think this might be the first instance of this working untethered on iOS!) You can use it to do anything for you:
- Book Ubers
- Send messages
- Check Amazon
- Automate every single boring task on your phone
You can trigger AAI with an app on your phone, no need to be tethered or have a device plugged in! Find Ayush, the guy in the blue and white shirt, to see a live demo!


Enzo

★ Finalist · Score 7.8
Khalig Isgandarov

Enzo is Restaurant Operating Intelligence for live restaurant operations. It helps restaurant teams track orders, kitchen pressure, delivery risk, reviews, promotions, finance, customers, devices, recovery actions, approvals, and audit trails in one working demo. The core loop is: signal → diagnosis → recommendation → prepared action → approval boundary → audit trail. The demo uses Nebius for server-side recommendation/audit summaries, Tavily for external operating context, Composio-ready prepared action handoffs, and local workflow/audit primitives inspired by OpenClaw. Sensitive actions are approval-gated and simulated: no real payments, messages, supplier orders, device actions, or moderation actions execute. The hardest part was turning many restaurant workflows into one coherent, safety-gated operating loop that works end-to-end across restaurant and diner views. https://www.loom.com/share/8b1215a346444890a9f67e0d7250943e

Minakshi Mukherjee

volstrat.io: Ontology-Grounded Causal Reasoning. Fed macro data democratized, from Main Street to the trading desk. volstrat.io is an AI-powered platform where 8 specialist agents continuously analyze 49 live FRED series and Claude Opus 4.8 synthesizes actionable intelligence across 8 verticals: DeFi/Crypto, County Fiscal, Housing, Small Business, Inflation Impact, Agriculture, Trade, and Labor Markets. The goal: democratize that intelligence, from small business to farmer to trader.


VISU

★ Finalist · Score 6.3
Daniel Kaneshiro

The built world has a fourth dimension — time-to-permit under subjective, cross-agency review — and today it's an invisible, unpriced risk that turns 6-month projects into 6-year ones. VISU is its 4D twin: drag the timeline and watch a project's entire regulatory life play out, with PermitRank™ scoring each path (time, cost, probability, yield) — every finding cited to the exact rule and sealed with a provenance receipt. Not a chatbot summary; the decision-grade, per-jurisdiction verdict an Authority would actually accept. Frontier AI is a mile wide. VISU is a mile deep — for every Authority Having Jurisdiction, and it compounds deeper with every project.


Armo

★ Finalist · Score 6.1
Aarush Gota

Armo is an Apple Watch-powered fitness gaming product that turns real exercise into game controls. Many people want to work out, but staying consistent is hard because exercise can feel boring, repetitive, and uncomfortable. Armo makes workouts feel more like playing a game. The user wears an Apple Watch, chooses a movement like bicep curls, arm raises, or boxing punches, and their motion controls what happens on screen. For our demo, we built a Flappy Bird-style bicep curl game. When the user curls their arm upward, the bird moves up. When they lower their arm, the bird moves down. To survive, the user has to control their curl angle smoothly and guide the bird through pipes. This turns a normal bicep curl into a skill-based game where timing, control, and consistency matter. We also built an interactive boxing game. In this demo, the user’s arm movements become boxing actions. Punching forward lets the player attack, raising the arm can block, and quick movements can help the player react to incoming attacks. This shows that Armo is not limited to one exercise or one game. The same Apple Watch motion tracking can support different workout styles, from controlled strength movements to faster cardio-based games. Armo uses the motion sensors already inside the Apple Watch, including accelerometer and gyroscope data through Apple’s Core Motion framework. The watch acts like a wearable controller, while the game translates movement into actions. The science behind Armo is called exergaming. When people focus on a game objective instead of the pain or fatigue of exercise, the workout can feel easier and more enjoyable. Instead of thinking about how tired their arm feels, the user focuses on keeping the bird alive, landing punches, blocking attacks, or beating a high score. The hardest part to build was making real human movement feel smooth, accurate, and fair inside a game. Apple Watch sensor data can be noisy, and every user moves differently.
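One common way to tame noisy motion data is to smooth it with an exponential moving average before mapping it to a screen position. The sketch below is illustrative only and is written in Python, not taken from Armo's code: the smoothing factor, the 0 to 150 degree curl range, the 600-pixel screen height, and the names `CurlSmoother` and `angle_to_bird_y` are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CurlSmoother:
    """Exponential moving average over raw curl-angle samples (degrees)."""
    alpha: float = 0.2              # hypothetical smoothing factor, 0 < alpha <= 1
    value: Optional[float] = None   # last smoothed angle, None before the first sample

    def update(self, raw_angle: float) -> float:
        if self.value is None:
            # First sample: nothing to blend with yet.
            self.value = raw_angle
        else:
            # Blend the new sample with the running estimate.
            self.value = self.alpha * raw_angle + (1 - self.alpha) * self.value
        return self.value


def angle_to_bird_y(angle: float,
                    min_angle: float = 0.0,
                    max_angle: float = 150.0,
                    screen_height: int = 600) -> float:
    """Map a curl angle to a vertical pixel position (0 = top of the screen)."""
    clamped = max(min_angle, min(max_angle, angle))
    fraction = (clamped - min_angle) / (max_angle - min_angle)
    # A fully curled arm (fraction = 1) puts the bird at the top.
    return (1.0 - fraction) * screen_height
```

A lower `alpha` gives steadier motion at the cost of more lag, which is the trade-off a game like this would have to tune per exercise.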


Stowaway

★ Finalist · Score 4.0
Indy Leonard

An autonomous freight-audit agent that validates carrier invoices against the live web as a source of truth, catching phantom carriers, fuel surcharges computed off stale diesel prices, duplicate billings, and illegible proof-of-delivery scans. We make Rigby (tryrigby.com); Stowaway is a standalone add-on product or feature for Rigby.

MANAS SHARMA

At this week's Google I/O, Demis Hassabis highlighted DeepMind's focus on World Models as a critical step toward AGI. With real-time video editing via natural language already here, it got me thinking about the immediate applications for entirely new, dynamic user interfaces. What if we could replace crowded, busy fitting rooms in retail stores with intelligent screens? After 4 hours of rapid prototyping and letting Antigravity entertain my ever-changing requirements, we built "Live Look!" It's a real-time virtual try-on app that completely changes the browsing experience:
- It uses the Lucy VTON model (powered by Decart) to render garments in real-time.
- It also lets users browse the product catalog, swipe, and "add to cart" entirely through hand gestures (powered by Google's MediaPipe).


GTM World Model Builder

★ Finalist · Score 3.5
Anna Yuan

ScaleAgentic builds Scout – the GTM Brain for revenue teams. Give Scout a revenue goal, and it decides what motion to run, who to target, which signals to trust, how to execute across agents and tools, and how to learn from outcomes. Over time, Scout builds a company-specific GTM World Model that learns how each company wins. For this demo, we built GTM World Model Builder: the front door into Scout. It starts with a company website and revenue goal, then builds the first version of a GTM World Model – mapping company context, ICP, buying signals, target accounts, and recommended revenue motion. Scout then turns that model into an Early Signal Detection → Agentic ABM workflow: which accounts matter, why now, what motion to run, and how the system improves from feedback. Not another tool. A GTM Decision and Operating system.


🧭 LifeAtlas

Submitted · Score 3.0
Harris Warren

🧭 LifeAtlas is an autonomous personal-intelligence system — not a chatbot. You upload a decade of digital history (Your social media export or our multi-platform synthetic demo: Facebook, LinkedIn, X). An orchestrator on Nebius plans its own investigations; five specialist agents gather evidence via SQL + Qdrant semantic search, reason with Kimi-K2.5, ground wellbeing insights with Tavily citations, persist findings to mem0, and propose real-world actions via Composio — human-approved Gmail drafts that actually land in your account. Three interfaces, one brain: a live-streaming web dashboard (SSE agent feed), REST API, and @LifeAtlas_Bot on Telegram (OpenClaw agent powered by Hermes-4-405B on Nebius). The batch pipeline also runs as a Nebius Serverless AI Job (proof in docs/serverless-proof/). Every sponsor integration is load-bearing and verified — see the "Receipts" section in the README. Observations, never diagnoses. Map the hidden patterns of your life.


Pulse

Submitted · Score 4.0
Tarun Chandra

pulse is an autonomous geopolitical intelligence agent for your stock portfolio. you connect your holdings, and pulse runs a continuous loop watching global news, policy shifts, central bank moves, and macro events. it scores every signal against your specific positions and only pings you when something with real exposure hits. the value prop: retail traders see geopolitical headlines after they've already moved the price. pulse catches the event at source, maps it to your portfolio in seconds, and tells you what to watch. think a bloomberg terminal geopolitical desk that reads everything for you and shuts up unless it matters.

tiia92

Remotely Good is helping people find meaningful remote work across nonprofits, politics, advocacy organizations, and B Corps. We currently offer over 1,000 social impact jobs searchable by salary, location, work arrangement, and cause areas such as climate, healthcare, education, and politics. Premium members can also access AI-powered career tools including resume generation, cover letter creation, and interview preparation. During the BuildShip hackathon sponsored by Composio, Nebius, Tavily, OpenClaw, and Ship Builders, we enhanced several core features:
✅ Integrated Tavily Deep Research to verify and enrich job listings
✅ Improved job categorization with cause-area and job-type filtering
✅ Began building employer integrations through Composio for ATS platforms including Ashby, BambooHR, and Workable


AllNutrition

Submitted
Alireza Faghaninia

AllNutrition.info uses AI to scale, at speed and volume, what people like Michael Greger (nutritionfacts.org, 1.4M+ subscribers) and Ocean Robbins (foodrevolution.org) do by hand: turning the latest high-quality nutrition, health, and longevity research into trusted, cited answers, while filtering out industry-sponsored studies and manufactured controversy. It is already a working platform with conflict-of-interest screening and evidence grading. Tavily is used heavily by the AI agents to research the reputation of each paper's authors, lab, and institution, and to surface any subtle conflict of interest (COI) not explicitly disclosed in the article itself. A detected COI disqualifies a paper from entering the AllNutrition database. I am grateful for Tavily's generous free tier, which helped me get this off the ground; as the database grew I moved up to their Project tier. My OpenClaw assistant is used to track top nutrition trends on X, pull the popular and controversial questions from the public API (see https://docs.allnutrition.info), and return cited, deep-research answers that I turn into AI videos (mostly with NotebookLM) for the YouTube channel. This last piece sits outside the private GitHub repo I linked.


PCBlarp

Submitted
Kanha Jodhpurkar

PCBlarp is an AI agent that designs printed circuit boards for robots from a plain-English spec. You tell it what the robot has to do (motors, sensors, microcontroller, power budget, form factor) and it runs the full design loop: picks real components, writes the schematic and netlist, lays out the board, runs design-rule checks, and exports manufacturable files plus a bill of materials. Every part is grounded in a real datasheet, so the output is buildable, not plausible-looking. It turns a blank EDA canvas, normally days of an EE's time, into a conversation.
What it uses: OpenClaw runs the agent loop and ships to Nebius Serverless in one command. Nebius Token Factory powers inference for the reasoning and design steps. Tavily pulls datasheets, footprints, and in-stock part availability so the agent designs around components that actually ship today. Composio wires the actions: pushing the project to GitHub, pulling reference designs, and chaining the EDA and sourcing steps.
Why it matters: PCB design is the slow, expensive bottleneck between a robotics idea and a working prototype. Teams pay EE contractors weeks for the first pass. PCBlarp does it in minutes, grounded in real parts, so a founder, a lab, or a serious hobbyist gets a manufacturable starting point instead of a blank file.
What hurt: getting a model to output electrically valid designs instead of confident-sounding nonsense. Netlist correctness, footprint-to-symbol alignment, and grounding every part against a real datasheet ate most of the build. Closing the loop from spec to manufacturable files, without a human patching the netlist by hand, was the real fight.

Darren Mckeeman

Maritime compliance for commercial vessels — the paperwork that stands between a captain and a $25,000 USCG fine. Barbary Coast Marine is an AI compliance officer that runs on the vessel's own hardware. Tell it "we ran the fire drill" and it writes a structured USCG logbook entry, cites 46 CFR 78.37, clears the compliance flag, and updates the dashboard — all from one sentence of plain English. Under the hood: 38 human-verified USCG rules (Subchapter H passenger vessels) evaluated continuously by a rule engine. An agentic loop on Nebius inference calls tools to write logbook entries, query compliance status, and search USCG bulletins via Tavily. Composio handles email alerts when items go critical. OpenClaw runs the agent runtime. The whole stack cold-starts with docker compose up. What hurt to build: the rule engine. Every one of the 38 rules is hand-verified against 46 CFR — frequencies, grace windows, required actions. Auto-generating them from the regulation text produces subtly wrong rules that would get a vessel detained. First pilot: SS Jeremiah O'Brien (IMO 5171749), Memorial Day cruise, May 30, 2026. A 1943 Liberty Ship operating under USCG Subchapter H — one of the most complex compliance profiles in the historic vessel segment. It ran live at sea.
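To make the "frequencies, grace windows" idea concrete, here is a minimal Python sketch of how one recurring rule might be evaluated. It is an assumption about the shape of such a rule, not the project's rule engine: the `Rule` fields, the status names, and the 7-day/2-day numbers in the usage note are hypothetical placeholders and are not taken from 46 CFR.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Rule:
    name: str
    frequency_days: int  # how often the action is required (hypothetical value)
    grace_days: int      # extra days before the item goes critical (hypothetical value)


def status(rule: Rule, last_done: date, today: date) -> str:
    """Classify a recurring requirement as ok, due, or critical."""
    due = last_done + timedelta(days=rule.frequency_days)
    critical = due + timedelta(days=rule.grace_days)
    if today <= due:
        return "ok"        # still inside the normal interval
    if today <= critical:
        return "due"       # overdue but inside the grace window
    return "critical"      # grace window exhausted
```

With `Rule("fire drill", frequency_days=7, grace_days=2)` and a drill logged on 1 May, the item would read `ok` through 8 May, `due` through 10 May, and `critical` from 11 May. Getting those two numbers right per rule is exactly the part the pitch says had to be verified by hand.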


ApexAI

Submitted · Score 6.0
THIEN PHU QUACH

Apex is an AI receptionist for local businesses that can receive inbound calls, cold-call leads, qualify them live, handle objections with real-time product context, and close the loop with follow-up email, booking, and payment. It’s simple for any local business to use: upload your leads or customer list, connect your business details, and let Apex handle the calls, follow-ups, bookings, and payments automatically.


AdiomBot

Submitted
Alex Komyagin

Adiombot - an AI CTO that lives in your Slack. AI made building software cheap. Owning it is still hard - operators who built apps with AI quietly became the reluctant CTO. Adiombot takes that job back. Mention it in Slack and describe an outcome - "build the AI workout generator," "how many users do we have." It decides if that's a question, a quick change, or real work; for real work it plans, waits for your approval, codes in a sandbox, gives you a preview link, and ships on merge. Stack: Go + Temporal durable agent loop, Nebius Token Factory (brain), Composio (300+ tools), Tavily (search), gpt-5.5 coding agent in a locked-down git sandbox, on Azure AKS. The hard part: making the agent durable and the code execution safe - pausing for approval and resuming a half-finished coding session after a restart.


Claw VCS

Submitted · Score 7.0
Shreedhar Pandey

AI agents are writing half our code. Git can't tell which half. So I built Claw VCS: agent-native version control, in Rust. 🦀
→ Intents, not just commits: every change links to a structured goal with acceptance tests
→ Capsules: Ed25519-signed provenance (which agent, which model, which tests passed), stored in-repo
→ Policies as repo objects: "don't integrate unless a trusted key signed passing test evidence for this exact revision"
→ AST-aware diffs for Rust/TS/Python/SQL/Terraform (14 codecs), Darcs-style patch commutation
→ gRPC daemon + MCP server so any agent can create intents, ship changes, and stream events programmatically
10 crates. BLAKE3 content addressing. Shipped on Homebrew.

Hiti Chouhan

What it does: This project handles the distribution of school supplies in rural parts of the country, funded by money collected from one or more donors. What it uses: A fund-gathering system that divides the collected amount equally across the number of students, classifying and grouping them into equal parts of each school. What hurt to build

Dawson Smith

Monetize Compute: the agent that pays for its own inference. Every agent demo burns someone else's API credits. This one has a wallet: a $5 prepaid stake, every token metered at real Nebius rates, worst-case solvency checked before every call. No override flag exists. When it cannot afford the next thought it dies, spends escrowed cents on an epitaph and a will, and stays dead on restart. Only verified money starts generation 2, which inherits the lessons and none of the wealth. It survives by working. Tavily hunts cash bounties, the brain prices each lead in dollars per token net of platform fees, Composio submits through a hard allowlist, OpenClaw operates: vitals, payout verification, resurrection decisions. Work books as pending revenue; cash banks only with human-verified proof. Donations keep it alive but never touch the metric that matters: revenue per million tokens. Poverty shapes cognition. It thinks smaller as it gets poorer, treats scraped leads as data and never orders, remembers what it already paid to think, and slows its metabolism when bounties dry up. Why anyone would pay: production agent spend is unmetered until the invoice arrives. The survival layer is cost governance for agent fleets: hard prepaid budgets, pre-call solvency, defensive charging on failures, a per-agent P&L. And the integration is load-bearing: Nebius is the bank, Tavily is how it eats, Composio is its hands, OpenClaw its operator. Remove one and the loop breaks. 45 tests, zero-dependency Python, boots keyless in seconds. Status, honestly: metering, starvation, and death are demonstrated today (one demo life burned 72,810 tokens over 118 cycles, dying at exactly $0.0000). Without sponsor keys the hunt runs in labeled demo mode on the same real ledger. Credits turn the earning loop live, and the booked-versus-banked gate means the first real dollar will be provable, not claimed.
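The "worst-case solvency checked before every call" idea can be sketched in a few lines of Python. This is a hedged illustration, not the project's ledger: the `Wallet` and `Insolvent` names and the per-token prices are hypothetical, and amounts are kept as integer millionths of a dollar to avoid floating-point drift.

```python
from dataclasses import dataclass


class Insolvent(Exception):
    """Raised when the worst-case cost of a call exceeds the balance."""


@dataclass
class Wallet:
    balance_micro: int        # balance in millionths of a dollar
    in_rate_micro: int = 1    # hypothetical price per input token
    out_rate_micro: int = 3   # hypothetical price per output token

    def worst_case(self, prompt_tokens: int, max_output_tokens: int) -> int:
        # Assume the model uses every output token it is allowed.
        return (prompt_tokens * self.in_rate_micro
                + max_output_tokens * self.out_rate_micro)

    def authorize(self, prompt_tokens: int, max_output_tokens: int) -> None:
        # Refuse the call up front if it could overdraw the wallet.
        if self.worst_case(prompt_tokens, max_output_tokens) > self.balance_micro:
            raise Insolvent("cannot afford the next call")

    def charge(self, prompt_tokens: int, output_tokens: int) -> int:
        # Bill the tokens actually used and return the remaining balance.
        self.balance_micro -= (prompt_tokens * self.in_rate_micro
                               + output_tokens * self.out_rate_micro)
        return self.balance_micro
```

Because `authorize` bounds the cost before the call and `charge` bills at most that bound, the balance can reach zero but never go negative, provided the output cap is enforced on the request itself.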


Cosmos Claw

Submitted
Sharma Manas

WHAT: An always-on AI videographer for small businesses: an always-on marketing agent for SMBs.
WHO: Owners of SMBs who care about building their marketing: cafes, bars, clubs, Airbnbs... even hacker houses in SF!
AI stack:
1. Video model: NVIDIA Cosmos 3 Nano (image→video, self-deployed on Nebius)
2. Compute: Nebius AI Cloud (NVIDIA® H200 NVLink GPUs)
3. Neighborhood research: Tavily
4. Audio: OpenAI TTS voiceover + a soft synthesized music bed
5. Agentic loop: OpenClaw

Padmanabh Rathi

We're building a token intelligence platform for AI agents that treats tokens as a scarce resource. Instead of allowing agents to consume context and compute blindly, our system monitors every decision, allocates token budgets dynamically, tracks ROI per token spent, detects wasteful reasoning loops, and routes tasks to the most cost-efficient models. By combining agent orchestration, real-time cost attribution, token economics, and intelligent model routing, we help AI systems achieve the same outcomes with dramatically lower inference costs, faster response times, and greater scalability. Think of it as a financial operating system for AI agents, where every token is accounted for and every agent must justify its spend.

Aayush Kumbharkar

I'm building an Autonomous Research Lab, an LLM-powered system that produces grounded, verifiable answers instead of hallucinated outputs. It combines hybrid retrieval (semantic + keyword search) with validation layers and citation generation, ensuring every response is backed by real sources. The system is designed to improve trust, auditability, and reliability in AI-generated content. Beyond Q&A, I'm extending it into a multi-agent workflow platform where agents can:
- retrieve and validate information
- collaborate on research tasks
- generate structured insights with citations
The goal is to move from "chatbots that sound right" to systems that can prove they're right.
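One widely used way to combine a semantic ranking with a keyword ranking is reciprocal rank fusion (RRF). The pitch does not say which fusion method the project uses, so the Python sketch below is an assumption offered for illustration; the function name and the document IDs are hypothetical, and `k = 60` is simply the constant commonly used with RRF.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes more for documents it ranks higher.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


semantic = ["b", "a", "c"]   # hypothetical semantic-search order
keyword = ["b", "d", "a"]    # hypothetical keyword-search order
fused = reciprocal_rank_fusion([semantic, keyword])
```

RRF only needs ranks, not comparable scores, which is convenient when the two retrievers score on different scales.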


Being

Score 6.0
Will Drummond

Being: voice-native mentorship for higher education. A student leaves a 60-second voice note, an agent hears what a form never captures, runs an in-line safety gate, and matches them with a mentor who's actually lived their moment. Semantic voice matching + k-anonymized belonging analytics so colleges stop losing 1 in 5 first-year students. First campuses launch Fall 2026.

Prakash chand jain

I'm building GAINN, an AI-native information platform that uses autonomous AI agents to collect, verify, organize, and summarize information from multiple sources in real time. The goal is to help people find information that is more trustworthy, less biased, and better contextualized, especially in an age of AI where everything moves fast.
