Maven Editorial Intelligence

AI-assisted proofreading for pharma promotional manuscripts. Four parallel AI checks against the brand's rules and references, output as an annotated PDF and a findings dashboard.

Client project · Maven Communications · 2026

Screenshots are excluded from this case study. Every screen in the product shows active pharma brand identifiers, prescribing-information references, or editorial workflows that are under NDA. Happy to walk through the product live.

Overview

Maven Editorial Intelligence (MEI) is an AI proofreading platform I built for Maven Communications, a US promotional medical communications agency. Editors drop a manuscript in, four AI modules run in parallel against the brand's rules and reference documents, and the editors get back an annotated PDF plus a findings dashboard for triage. I started the build on April 23, 2026 and shipped a multi-tenant MVP into pilot in five weeks: 108 PRs merged, 49 migrations, 106 test files. The stack is Next.js 16, Supabase, OpenAI, and pdf-lib, deployed on Vercel and Fly.io.

The challenge

Promotional manuscripts in pharma go through med-legal review (MLR). Three reviewers (medical, legal, regulatory) read every page, check every claim against the prescribing information, verify that required regulatory elements are present, and mark up the PDF by hand. A 12-page sales aid can take a senior reviewer two to three hours. Christopher, who runs Maven Communications, wanted to compress the first pass without changing the artifact reviewers work in.

The constraint was real. Med-legal reviewers live inside Adobe Acrobat. Whatever the AI did had to come back as a PDF they could mark up the same way they always do.

The reframe

It is not a chat tool with a PDF export. It is an MLR reviewer that hands you a marked-up PDF before you start.

Once I framed it that way, the output became the product. Findings render as PDF 1.7 highlights and sticky-note popups, the same shape med-legal reviewers already mark up by hand. Each finding has the verbatim quote, the rule it violates, a suggested fix, a severity, and (for claims) the reference that supports or contradicts it. The dashboard is for triaging and re-running. The file the reviewer opens in Acrobat is the deliverable.

Key decisions

Four modules in parallel, not one mega-prompt

Pharma MLR is not one job. It is four distinct ones: Grammar (yellow), Brand (orange), Claims (red), Annotations (blue). Each module gets its own JSON schema, its own prompt, and its own color in the PDF. Tradeoff: four prompts and four schemas to maintain instead of one. Win: a brand-rule violation never gets muddled with a missing Important Safety Information box, and reviewers can filter findings by module on the dashboard.
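A minimal sketch of that fan-out, assuming each module is an async function that takes the manuscript text and returns findings. The names `ModuleName`, `Finding`, `ModuleRunner`, and `reviewManuscript` are hypothetical, and the real per-module schemas are richer than the one shown here.

```typescript
// Hypothetical types for illustration only.
type ModuleName = "grammar" | "brand" | "claims" | "annotations";

interface Finding {
  module: ModuleName;
  quote: string; // verbatim text from the manuscript
  rule: string; // the rule the quote violates
  severity: "low" | "medium" | "high";
}

type ModuleRunner = (manuscript: string) => Promise<Finding[]>;

interface ModuleResult {
  module: ModuleName;
  findings: Finding[];
  error?: string;
}

// Start every module at once. A rejected module promise is captured on
// its own result, so one failing pass does not reject the whole review.
async function reviewManuscript(
  manuscript: string,
  runners: Record<ModuleName, ModuleRunner>,
): Promise<ModuleResult[]> {
  const names = Object.keys(runners) as ModuleName[];
  return Promise.all(
    names.map((module) =>
      runners[module](manuscript).then(
        (findings): ModuleResult => ({ module, findings }),
        (err: unknown): ModuleResult => ({
          module,
          findings: [],
          error: err instanceof Error ? err.message : String(err),
        }),
      ),
    ),
  );
}
```

Because each result carries its module name, filtering findings by module on a dashboard is a plain `filter` over the flattened list.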

pgvector retrieval over context stuffing

Early versions stuffed the brand's full reference library into every prompt. It blew up on a 957k-token blob and was expensive at any reasonable volume. I migrated retrieval to pgvector, chunked references at ingest, and made every module's prompt pull only the relevant chunks. Tradeoff: a real retrieval layer with its own evaluation harness. Win: predictable costs, and a reference library that scales with the brand, not with the prompt budget.
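The retrieval idea can be sketched in memory, assuming embeddings are already computed for each chunk and for the query. In pgvector the ranking step would be an `ORDER BY embedding <=> $1 LIMIT k` query (cosine distance) rather than the array sort below. `chunkWords`, `cosineSimilarity`, `topK`, and the word-window chunking strategy are illustrative assumptions, not the production design.

```typescript
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Split a reference into fixed-size word windows with overlap, so text
// cut at a boundary still appears whole in a neighbouring chunk.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += size - overlap) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // last window reached the end
  }
  return chunks;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks most similar to the query embedding.
function topK(query: number[], chunks: Chunk[], k: number): Chunk[] {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((x) => x.chunk);
}
```

Only the `k` returned chunks go into a module's prompt, so prompt size is bounded by `k` and the chunk size, not by the size of the reference library.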

Annotated PDF as the artifact, not the screen

The job page has a Grammarly-style workspace with the manuscript on the left and findings on the right. But the file the reviewer downloads is the actual product. Annotations follow the open PDF 1.7 spec, so highlights and popup comments survive in Acrobat, Preview, Foxit, anything. Tradeoff: pdf-lib mechanics, font fallbacks, and a watermark guard so cross-brand finding IDs never collide. Win: reviewers got a file that fit their existing workflow on day one.
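For readers unfamiliar with PDF markup annotations, here is a rough sketch of the dictionary a single highlight needs, written as a plain object instead of through pdf-lib. The entry names (`Type`, `Subtype`, `Rect`, `QuadPoints`, `C`, `Contents`, `NM`) come from the PDF 1.7 annotation dictionaries. The function name, the exact RGB shades, and the corner order used for `QuadPoints` are assumptions; viewers are known to differ on corner order, so verify it against the viewers you target.

```typescript
type ModuleName = "grammar" | "brand" | "claims" | "annotations";

// RGB in the 0..1 range that PDF uses. Exact shades are assumptions.
const MODULE_COLORS: Record<ModuleName, [number, number, number]> = {
  grammar: [1, 1, 0], // yellow
  brand: [1, 0.65, 0], // orange
  claims: [1, 0, 0], // red
  annotations: [0, 0, 1], // blue
};

// A box in PDF user space, where the origin is the bottom-left corner.
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface HighlightDict {
  Type: "Annot";
  Subtype: "Highlight";
  Rect: [number, number, number, number]; // [left, bottom, right, top]
  QuadPoints: number[]; // 8 numbers per highlighted quad
  C: [number, number, number]; // highlight color
  Contents: string; // text shown in the popup comment
  NM: string; // annotation name, used here for the finding ID
}

function buildHighlight(
  module: ModuleName,
  box: Box,
  comment: string,
  findingId: string,
): HighlightDict {
  const left = box.x;
  const right = box.x + box.width;
  const bottom = box.y;
  const top = box.y + box.height;
  return {
    Type: "Annot",
    Subtype: "Highlight",
    Rect: [left, bottom, right, top],
    // Assumed order: top-left, top-right, bottom-left, bottom-right.
    QuadPoints: [left, top, right, top, left, bottom, right, bottom],
    C: MODULE_COLORS[module],
    Contents: comment,
    NM: findingId,
  };
}
```

A finding that spans several lines of text would carry one quad per line in `QuadPoints`, with `Rect` enclosing all of them; this sketch handles the single-quad case only.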

Rejection memory that survives re-runs

Reviewers reject findings they disagree with. I built an addressed-PDF parser that reads the Accepted/Rejected state from a prior MEI run, fingerprints each rejected finding, and feeds those hints to the LLM on the next review. False positives shrink over time without retraining a model. Tradeoff: a fingerprint design anchored to span and rule (not LLM output text), plus a suppression audit log. Win: a feedback loop that doesn't depend on labelled training data.
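One way to sketch a fingerprint anchored to span and rule, assuming the span is represented by its verbatim quote and the rule by an ID. The whitespace and case normalization, the SHA-256 hash, and the names `fingerprint` and `wasPreviouslyRejected` are illustrative assumptions. The point is that the LLM's explanation text never enters the hash, so a reworded explanation on the next run still matches the earlier rejection.

```typescript
import { createHash } from "node:crypto";

interface FindingLike {
  quote: string; // verbatim span from the manuscript
  ruleId: string; // the rule the finding cites
  explanation: string; // LLM-written text, deliberately not hashed
}

// Collapse whitespace and case so PDF extraction noise does not change the key.
function normalizeSpan(quote: string): string {
  return quote.trim().replace(/\s+/g, " ").toLowerCase();
}

// Hash only the rule and the normalized span. The NUL separator keeps
// ("ab", "c") and ("a", "bc") from producing the same key.
function fingerprint(finding: Pick<FindingLike, "quote" | "ruleId">): string {
  const key = [finding.ruleId, normalizeSpan(finding.quote)].join("\u0000");
  return createHash("sha256").update(key, "utf8").digest("hex");
}

// Check a new finding against fingerprints of findings rejected earlier.
function wasPreviouslyRejected(
  finding: FindingLike,
  rejectedFingerprints: Set<string>,
): boolean {
  return rejectedFingerprints.has(fingerprint(finding));
}
```

Matches can be passed to the model as hints or written to an audit log; which of the two happens on a match is a policy choice outside this sketch.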

Smartsheet at the brand layer, not in a sidebar

Maven's editorial pipeline runs on Smartsheet. I made the integration a brand-create requirement: every brand has a Smartsheet Sheet ID, and every completed review posts a row back. Tradeoff: a brand cannot be created without a Sheet ID, which I accepted because an optional integration would have left some brands in a half-broken state. Win: reviews land where Maven's editors already track work. No second system to check.

Multi-tenant from day one

Every public table has Postgres row-level security. A user only sees brands they belong to via client_members. Platform admins are a separate flag, not a permission level. Tradeoff: extra RLS work on every migration. Win: the moment Maven onboards a second pharma client, the isolation is already there. No “we'll add it later”.

Impact

Shipped a multi-tenant MVP in five weeks (Apr 23 to May 26, 2026).

108 PRs merged out of 129 opened. Each major surface gated behind a verified review.

Around 56,000 lines of TS/TSX/SQL, 49 migrations, 106 test files.

Four AI modules running in parallel against brand-scoped rules and pgvector-backed references.

In pilot with Maven Communications on real promotional manuscripts.

Rejection memory loop closes the gap on false positives without model retraining.

Smartsheet integration writes every completed review back to Maven's editorial pipeline.

Timeline

W1: Six-wave MVP. Upload, pgmq queue, four AI passes, annotated PDF, findings dashboard.

W2: Drop Adobe and S3 from the spec. Switch to pdf-lib and Supabase Storage. Rules editor, file picker, tester activity script.

W3: Rebrand from Maven Review to Maven Editorial Intelligence. Sidebar slim-down, Back Office for platform references, audience-specific guidance.

W4: Phase B AMA hints wired into all four passes. Eval harness, rejection-hint prompt block, rejection-memory persistence.

W5: Phase E OPDP letter ingestion. pgvector RAG replaces reference-blob stuffing. Smartsheet integration end-to-end. Edge Functions migration. In-app Approve / Reject with on-demand PDF restamp.

Reflection

The hardest call on this build was treating the PDF as the product. The temptation was a chat-style review tool with a file export, but med-legal reviewers don't want a new interface. They want their existing one to be 60% lighter. Building toward an annotated PDF (not toward a screen) is what made this adoptable.

If I were starting over, I'd build retrieval before the first prompt, not after a 957k-token outage. Every reference-heavy LLM product I build from now on starts with pgvector on day one.

Tech Stack

Next.js 16 · React 19 · TypeScript · Tailwind v4 · Supabase · PostgreSQL · pgvector · pgmq · OpenAI · pdf-lib · unpdf · mammoth · xlsx · Resend · Sentry · Smartsheet API · Vercel · Fly.io
