Project-Level and Bid-Package (Folder) Workflow
Product context: Estym8 is built from scratch as an AI-first construction preconstruction platform—not a legacy takeoff stack with AI bolted on. AI runs across the product: bid-package ingestion and classification, multi-model takeoff and vision, plan intelligence, cross-file synthesis, Estee, estimate-to-submittal draft review, and optimization recommendations. Canonical framing: AI-first positioning.
Goal: Keep fast drawing-PDF uploads (full multi-discipline takeoff or optional symbols-only CV path on the same project) and support whole bid packages: a GC, builder, or estimator uploads a folder of files for a job—drawing PDFs plus spreadsheets, schedules, text, JSON, and common images. The system classifies each item, builds a run plan for sheets that qualify for takeoff (including multi-file discipline bundles), persists an AI project overview (game plan, estimate strategy, file roles), documents the plan on the project, and executes it.
This workflow is part of an AI-first product built from scratch—classification, planning, takeoff, and cross-file intelligence are model-driven stages on one spine, not manual steps with optional AI assists. See AI-first positioning.
1. Current workflows (three live paths)
- Single drawing PDF (full takeoff): Create or open a project → Upload plan PDF → one job → one estimate per file. Fast default for a full multi-discipline takeoff on that drawing.
- Single drawing PDF, symbols-only (CV): Same project → optional symbols-only takeoff (lighter pipeline: OpenAI + template counting, `SymbolTakeoff` row, blob PDF). Review drawing + detection boxes in-app, then Promote to estimate when you want a normal estimate row. Does not replace folder bundles; plan-gated like other takeoff entry points.
- Bid-package folder: Same project → Upload folder (plans & docs) → every accepted file is stored with MIME + kind (plan PDF vs spreadsheet vs text vs image, etc.) → analyze → plan → optional full pipeline (execute takeoffs) → AI project overview on the project when the configured estimate model API is available (commonly `GROK_API_KEY` where Grok is used for those jobs).
A project may have many estimates (multiple single uploads, folder runs, rebids, versions).
1b. Expanded vision (ongoing)
Beyond "classify each file and group runs," the target experience for a construction company uploading a whole project folder includes:
- Per-file analysis: Discipline, role, skip vs process, quality (for drawing PDFs via the lightweight classifier; reference files via snippet + rules; layered "obvious non-plan" detector for RFPs, geotech reports, safety conditions, etc., so they aren't routed into MEP takeoff).
- Per-document verbatim harvest (shipped): Every drawing PDF kept by the classifier runs a single focused "As Printed on the Sheets" pass (title block, applicable code editions, exception clauses, printed allowances, calculation rules, and cross-sheet references), stored on the estimate as `documentTextHarvest` so every claim traces back to a sheet.
- Project-level synthesis (shipped, iterative): `intelligenceJson` on `ProjectFolderRun`, with executive summary, inferred project type, discipline coverage, likely document gaps, game plan phases, estimate strategy (including narrative hooks for concrete, earthwork, etc., without inventing quantities), per-file roles, and risk/exclusion notes. Produced after analyze/plan; refined as prompts and inputs improve.
- Cross-file intelligence pass (shipped): After the folder run completes, a post-folder cross-file intelligence pass compares every bundle's harvested digest and emits consistency findings, coordination topics, and RFI seeds across disciplines. Stored on `ProjectFolderRun.crossFileIntelligenceJson`, validated against `CrossFileIntelligenceSchema`. Distinct from `intelligenceJson` (which is per-folder synthesis) and from the per-document harvest. Implemented in `lib/folder-workflow/cross-file-intelligence.ts`.
- Code-edition mismatch alerts (shipped): When sheets disagree on code edition (e.g. one sheet cites NEC 2017 and another cites NEC 2020, or NFPA / IBC editions disagree), the project page surfaces the conflict so the user can RFI before bid day.
- Code and locale context (planned depth): Use jurisdiction / locale to inform code-aware notes, aligned with today's locale-aware MEP behavior and extended as disciplines grow (see product roadmap, `ROADMAP_BEYOND_MEP.md` in repo).
- 6+ disciplines on one folder pipeline (shipped): Same classify → plan → execute loop covers electrical, mechanical, plumbing, civil, architectural, structural and adjacent trades. Coverage and depth vary by sheet quality and trade. Additional vertical depth (concrete, air balance, HVAC recommendations, etc.) is on the product roadmap, not a separate product line.
- Multi-discipline narrative report (shipped): Single downloadable Markdown / DOCX / PDF report that stitches the verbatim plan harvest, every completed takeoff, conflict log, draft RFIs, and bid-bucket reconciliation vs engineer-printed grand totals into one decision-ready package.
Pool builders and in-ground work: Many pool jobs are site and utility–centric. Where PDFs include legends, schedules, and quantifiable scope, the same PDF → takeoff approach applies; accuracy depends on sheet quality and whether scope matches our MEP-forward extraction today.
2. Folder workflow: analyze → plan → document → display → execute
A folder will often contain files that are not full drawing sets (spec excerpts, pricing spreadsheets, photos). We do not send every file through symbol takeoff. Instead:
2.1 Analyze each file
- Input: Folder of files (or a `.zip` archive at the root of the picker): PDF plans, Excel/CSV, Word (`.docx`), Outlook email (`.msg` / `.eml`), text/markdown/JSON, images (see `FOLDER_UPLOAD_EXTENSIONS` in code). `.zip` files are expanded before ingest (path-safe; capped uncompressed size per archive). Max file count: `MAX_FOLDER_FILES` (500) after expansion. Each file stores `relativePath` (nested location within the upload tree) plus `originalName` (leaf filename).
- Analyze all, process selectively: Every accepted file is classified, and non-plan assets get bounded text snippets (including `.docx` / `.xlsx` / `.msg` / `.eml`). PDFs skipped for takeoff (specs, RFIs, deduped combined sets, etc.) still get full-document text extraction (`fullTextSnippet`) plus optional supporting-file LLM insight; they are excluded from Grok takeoff jobs only. Plan PDFs with `processRecommendation !== 'skip'` run takeoff jobs. Combined vs separated drawing sets: when a substantial Separated/ per-sheet set exists, the combined Arch/Struc/MEP PDF is still fully read for context but skipped for takeoff to avoid duplicate counts (`lib/folder-workflow/dedupe-drawing-sets.ts`). Path hints (`specs/`, `quotes/`, `Geotech/`, `FP Plans/`, …) steer classification before takeoff planning.
- Step:
  - Plan PDFs: Lightweight PDF classification (discipline, role, skip vs process).
  - Non-plan assets: Fetch blob → bounded snippet (e.g. CSV/XLSX/text; image placeholder) → reference classification (`processRecommendation: 'skip'` for takeoff; intelligence-only).
- Output: Per-file classification stored on `FolderRunFile` with `fileKind` and `mimeType`.
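The pre-classification part of this step can be sketched as below. This is a minimal illustration, not the code in `lib/folder-workflow/*`: the `FileKind` values, the extension map, the hint roles, and the function names `inferFileKind` and `pathHintRole` are all hypothetical. Only `relativePath`, `originalName`, and the example path hints come from this document.

```typescript
// Hypothetical sketch: derive a coarse file kind and a path-hint role
// before any model-based classification runs. All names are illustrative.
type FileKind =
  | "plan_pdf" | "spreadsheet" | "document" | "email" | "text" | "image" | "other";

interface UploadedFile {
  relativePath: string; // nested location within the upload tree
  originalName: string; // leaf filename
}

// Assumed extension map; the real list lives in FOLDER_UPLOAD_EXTENSIONS.
const KIND_BY_EXTENSION: Record<string, FileKind> = {
  pdf: "plan_pdf", // provisional: the PDF classifier may still recommend skip
  xlsx: "spreadsheet", csv: "spreadsheet",
  docx: "document",
  msg: "email", eml: "email",
  txt: "text", md: "text", json: "text",
  png: "image", jpg: "image", jpeg: "image",
};

function inferFileKind(file: UploadedFile): FileKind {
  const dot = file.originalName.lastIndexOf(".");
  if (dot < 0) return "other";
  const ext = file.originalName.slice(dot + 1).toLowerCase();
  return KIND_BY_EXTENSION[ext] ?? "other";
}

// Path hints steer classification; they do not decide it on their own.
const PATH_HINTS: Array<[string, string]> = [
  ["specs/", "specification"],
  ["quotes/", "pricing"],
  ["geotech/", "geotechnical"],
  ["fp plans/", "fire_protection_plan"],
];

function pathHintRole(file: UploadedFile): string | null {
  // Normalize case and Windows separators so hints match consistently.
  const path = file.relativePath.toLowerCase().replace(/\\/g, "/");
  for (const [needle, role] of PATH_HINTS) {
    if (path.indexOf(needle) !== -1) return role;
  }
  return null;
}
```

In this sketch a PDF under `Specs/` is still typed as a PDF, but the hint lets a later classifier lean toward a skip recommendation for takeoff.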
2.2 Create a plan of action
- Skip: Reference-only or irrelevant files.
- Process singly: One drawing PDF → one job → one estimate.
- Process as group: Multiple files same discipline → one bundled takeoff when the plan says so.
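The three outcomes above can be sketched as a small planning function. This is an assumption-laden illustration: `ClassifiedFile`, `PlanAction`, and `buildPlan` are hypothetical names, and the rule "bundle whenever a discipline has more than one processable file" is a simplification of "when the plan says so".

```typescript
// Hypothetical sketch of turning per-file classifications into plan actions.
interface ClassifiedFile {
  name: string;
  discipline: string; // e.g. "electrical"
  processRecommendation: "process" | "skip";
}

type PlanAction =
  | { type: "skip"; file: string }
  | { type: "single"; discipline: string; file: string }
  | { type: "group"; discipline: string; files: string[] };

function buildPlan(files: ClassifiedFile[]): PlanAction[] {
  const actions: PlanAction[] = [];
  const byDiscipline = new Map<string, string[]>();

  for (const f of files) {
    if (f.processRecommendation === "skip") {
      // Reference-only or irrelevant files never reach takeoff.
      actions.push({ type: "skip", file: f.name });
      continue;
    }
    const bucket = byDiscipline.get(f.discipline) ?? [];
    bucket.push(f.name);
    byDiscipline.set(f.discipline, bucket);
  }

  byDiscipline.forEach((names, discipline) => {
    const first = names[0];
    if (names.length === 1 && first !== undefined) {
      // One drawing PDF → one job → one estimate.
      actions.push({ type: "single", discipline, file: first });
    } else {
      // Several files in one discipline → one bundled takeoff.
      actions.push({ type: "group", discipline, files: names });
    }
  });

  return actions;
}
```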
2.3 Document the plan (per project)
- `ProjectFolderRun`: Folder name, status, `planJson`, `intelligenceJson`, pipeline job counters, `FolderRunFile` rows (name, URL, classification, kind).
2.4 Display in the UI
- Project page: Folder run card — file count, per-action outcomes (success / failed / skipped / pending), AI project overview when present.
2.5 Execute the plan
- Run actions enqueue takeoff jobs (e.g. Grok-backed `GrokJob` where configured); results link to estimates; UI shows progress.
2.5b Symbols-only takeoff (single-PDF path, not folder plan actions)
- When: User runs symbols-only takeoff from the project UI (`POST /api/projects/[projectId]/symbol-takeoff`).
- What: CV/template pipeline → `SymbolTakeoff` + optional PDF overlay review → promote → `Estimate`. Same project as folder work; not a `planJson` action.
- Code: `lib/symbols/run-and-persist.ts`, `app/api/projects/[projectId]/symbol-takeoff/*`, `app/projects/[projectId]/symbol-takeoff/[takeoffId]/page.tsx`.
2.6 AI project overview (per-folder intelligence)
- When: After analyze + plan (and after full-pipeline analyze step). Requires the configured estimate model client; failures are non-fatal for the rest of the pipeline.
- What: Structured JSON validated against `ProjectIntelligenceSchema`: executive summary, file roles, phased game plan, estimate strategy (quantities to develop, bid structure suggestions, concrete/foundations and earthwork narratives, risks/exclusions). Not a substitute for measured takeoff where we did not run counts.
- Stored at: `ProjectFolderRun.intelligenceJson`.
2.7 Verbatim "As Printed on the Sheets" harvest (per-document)
- When: During the takeoff job for each drawing PDF kept by the classifier.
- What: A single focused pass that lifts title block, applicable code editions, exception clauses, printed allowances, calculation rules, and cross-sheet references verbatim from each PDF. The harvest also populates per-sheet general notes and keynotes when present.
- Stored at: `EstimateOutput.documentTextHarvest` (and consumed by the narrative report and Plan Review tab).
- Why it matters: Every claim shown to the user can be traced back to a specific sheet, not to model inference. Powers code-edition mismatch alerts, exception/allowance summaries, and the harvest section of the multi-discipline narrative report.
2.8 Cross-file intelligence (post-folder pass)
- When: After the folder run completes (every eligible takeoff has at least attempted to run and the per-document harvests are persisted).
- What: A pass over every bundle's harvested digest that emits consistency findings (e.g. schedule vs plan conflicts), coordination topics (cross-discipline issues that need an early conversation), and RFI seeds (draft questions that the user can promote into a tracked RFI). Validated against `CrossFileIntelligenceSchema`; over-length string fields are silently truncated rather than rejected.
- Stored at: `ProjectFolderRun.crossFileIntelligenceJson`.
- Distinct from: `intelligenceJson` (which is the per-folder AI project overview produced earlier in the pipeline) and `documentTextHarvest` (which is per-document and verbatim).
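The "truncate rather than reject" behavior can be illustrated with a small recursive helper. This is a sketch under stated assumptions, not the logic in `cross-file-intelligence-schema.ts`: `truncateStrings` and the single global `maxLength` are hypothetical, and the real schema may apply different limits per field.

```typescript
// Hypothetical sketch: clamp over-length strings instead of rejecting the payload.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

function truncateStrings(value: Json, maxLength: number): Json {
  if (typeof value === "string") {
    return value.length > maxLength ? value.slice(0, maxLength) : value;
  }
  if (Array.isArray(value)) {
    return value.map((item) => truncateStrings(item, maxLength));
  }
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const key of Object.keys(value)) {
      const child = value[key];
      if (child !== undefined) out[key] = truncateStrings(child, maxLength);
    }
    return out;
  }
  return value; // numbers, booleans, and null pass through unchanged
}
```

The trade-off is that a long model answer still produces a stored result, at the cost of losing the tail of the text without any signal to the user.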
2.9 Code-edition mismatch alerts
- When: Surfaced on the project page once at least one harvest contains a `codeEdition` value.
- What: When sheets in the same project cite different code editions (e.g. NEC 2017 vs NEC 2020, or NFPA / IBC editions disagree), the conflict is flagged so the user can RFI before bid day.
- Driven by: `documentTextHarvest.codeEdition` rolled up across every estimate on the project.
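The rollup described above might look like the following sketch. It assumes each `codeEdition` value is a single string shaped like "NEC 2020" (code name followed by a four-digit year); that shape, the `HarvestRef` and `EditionConflict` types, and `findEditionConflicts` are all hypothetical.

```typescript
// Hypothetical sketch: roll up code editions across estimates and flag disagreements.
interface HarvestRef {
  estimateId: string;
  codeEdition: string | null; // e.g. "NEC 2020"; null when the sheet prints none
}

interface EditionConflict {
  code: string;       // e.g. "NEC"
  editions: string[]; // distinct years found, sorted ascending
}

function findEditionConflicts(harvests: HarvestRef[]): EditionConflict[] {
  const editionsByCode = new Map<string, Set<string>>();

  for (const h of harvests) {
    if (!h.codeEdition) continue;
    const parts = h.codeEdition.trim().split(/\s+/);
    const year = parts[parts.length - 1];
    // Assumed shape "<CODE...> <YYYY>"; values that do not fit are ignored here.
    if (parts.length < 2 || year === undefined || !/^\d{4}$/.test(year)) continue;
    const code = parts.slice(0, -1).join(" ").toUpperCase();
    const seen = editionsByCode.get(code) ?? new Set<string>();
    seen.add(year);
    editionsByCode.set(code, seen);
  }

  const conflicts: EditionConflict[] = [];
  editionsByCode.forEach((years, code) => {
    if (years.size > 1) {
      const editions: string[] = [];
      years.forEach((y) => editions.push(y));
      conflicts.push({ code, editions: editions.sort() });
    }
  });
  return conflicts;
}
```

A project where every sheet agrees returns an empty list, so the alert only appears when at least two distinct editions of the same code are printed.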
2.10 Multi-discipline narrative report
- When: Available to download from the project header any time at least one COMPLETED estimate has MEP / fire-protection device counts.
- What: Single downloadable report (Markdown / DOCX / PDF) that stitches the verbatim As Printed on the Sheets harvest, every completed takeoff (per discipline, deduplicated), the conflict log, draft RFIs, and bid-bucket reconciliation that compares AI counts against engineer-printed grand totals from the schedules.
- Endpoint: `GET /api/projects/[projectId]/report/export?format=docx|pdf|md` (default `docx`).
- Implementation: `lib/reports/narrative-report.ts` (markdown rendering) → `lib/reports/markdown-to-docx.ts` / `lib/reports/markdown-to-pdf.ts` (format conversion).
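A client-side helper for the export endpoint could be as small as the sketch below. The path and the `format` values come from this section; the helper name `buildReportExportUrl` and the `ReportFormat` type are hypothetical.

```typescript
// Hypothetical helper: build the narrative report export URL for a project.
type ReportFormat = "docx" | "pdf" | "md";

function buildReportExportUrl(projectId: string, format?: ReportFormat): string {
  const base = `/api/projects/${encodeURIComponent(projectId)}/report/export`;
  // Omitting the parameter relies on the documented server default (docx).
  return format ? `${base}?format=${format}` : base;
}
```

For example, `buildReportExportUrl(projectId, "md")` yields a URL that a download link or a `fetch` call can use directly.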
3. Multiple files per discipline
- Goal: One logical electrical (or mechanical, etc.) takeoff spanning several PDFs.
- Implementation: Pipeline accepts multiple PDF URLs for one job for bundle actions; plan and UI show discipline (N files) → 1 run.
4. Data model (implemented shape)
- `ProjectFolderRun`: `planJson`, `intelligenceJson` (per-folder AI project overview), `crossFileIntelligenceJson` (post-folder cross-file pass), pipeline fields, status.
- `FolderRunFile`: `originalName`, `pdfUrl` (blob URL for any type), `mimeType`, `fileKind` (`FolderRunFileKind`), `classification`, analysis status.
- `Estimate.rawOutput.documentTextHarvest`: Per-document verbatim harvest (title block, code editions, exceptions, allowances, calculation rules, cross-sheet refs, per-sheet general notes / keynotes).
- `Project`, `Estimate`, `GrokJob`: linked from plan actions and folder metadata.
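For readers who think in types, the shape above can be approximated as follows. Field names are taken from this section; every type, the `files` relation name, and the helper `intelligenceLayers` are assumptions made for illustration and may not match the real schema.

```typescript
// Hypothetical TypeScript view of the documented fields.
// Field names come from the doc; all types and optionality are assumptions.
interface FolderRunFile {
  originalName: string;
  pdfUrl: string;          // blob URL for any file type, not only PDFs
  mimeType: string;
  fileKind: string;        // stands in for FolderRunFileKind
  classification: unknown; // per-file classification payload
}

interface ProjectFolderRun {
  status: string;
  planJson: unknown;
  intelligenceJson: unknown;          // per-folder AI project overview; null when absent
  crossFileIntelligenceJson: unknown; // post-folder cross-file pass; null when absent
  files: FolderRunFile[];             // assumed relation name
}

// Report which intelligence layers are present on a run (either can be missing).
function intelligenceLayers(run: ProjectFolderRun): string[] {
  const layers: string[] = [];
  if (run.intelligenceJson != null) layers.push("overview");
  if (run.crossFileIntelligenceJson != null) layers.push("cross-file");
  return layers;
}
```

Keeping the two JSON columns separate mirrors the distinction drawn earlier: the overview is produced after analyze/plan, while the cross-file pass only exists once the folder run has completed.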
5. Implementation status (summary)
| Phase | Status |
|---|---|
| Upload folder (multi-type) + store kinds | Shipped |
| Analyze + classify (PDF + reference) | Shipped |
| Layered "obvious non-plan" detector (RFP, geotech, safety, etc.) | Shipped |
| Build & persist plan | Shipped |
| Project UI: folder run + intelligence | Shipped |
| Execute plan (single + group) | Shipped |
| 6+ disciplines on one folder pipeline (electrical, mechanical, plumbing, civil, architectural, structural) | Shipped |
| Per-document verbatim harvest ("As Printed on the Sheets") | Shipped |
| Per-folder AI project overview (`intelligenceJson`) | Shipped (env-dependent) |
| Post-folder cross-file intelligence (`crossFileIntelligenceJson`) | Shipped |
| Code-edition mismatch alerts on the project page | Shipped |
| Multi-discipline narrative report (Markdown / DOCX / PDF) | Shipped |
| Bid-bucket reconciliation (AI vs engineer-printed totals) in the narrative report | Shipped |
| Single-PDF symbols-only takeoff + PDF review + promote to estimate | Shipped |
| User-editable plan overrides | Future |
| Deeper cross-document synthesis (schedules vs plans, specs vs sheets) beyond the shipped cross-file pass | Future |
6. Out of scope / future
- Auto-execute without review (optional product choice later).
- Re-run only failed actions — nice-to-have.
- Beyond-MEP structured verticals — definitions in takeoff verticals; narrative + sequencing in product roadmap.
6b. Electrical image-tail governance (all plan sets)
When `IMAGE_TAIL_POWER_PASS` is enabled (default) on an electrical or unknown-discipline folder bundle, every plan set receives the same pipeline, with no job-specific code paths:
| Stage | Module | Applies to |
|---|---|---|
| Vector power-plan scope | `lib/takeoff/electrical-power-plan-scope.ts` | Full permit sets (E3.x / site power pages) |
| Plan-area vision + tiled sweep | `lib/takeoff/image-tail-plan-area-vision.ts` | All scoped power-plan pages |
| Dense glyph expand | `lib/takeoff/image-tail-power-device-dense-expand.ts` | UB/J-box always expand (never blocked by partial plan-area counts) |
| Text-layer J/UB floors | `lib/takeoff/power-plan-text-layer-counts.ts` | After dense expand |
| Schedule cap/floor | `lib/takeoff/power-device-schedule-validation.ts` | Trusted PDF schedule + optional human BOQ fixture |
Human BOQ schedule fixtures (`__tests__/fixtures/ground-truth/schedules/<testCaseId>.json`) are per-project validation data, loaded when the project name matches a name pattern in `lib/jonas/load-receptacle-schedule-reference.ts`. They floor under-counts (e.g. UB/J-box on permit sets that omit interior modular sheets) but do not change which pipeline stages run. Add a fixture + name pattern when onboarding a new GC compare target.
Modular ASMEP jobs (`detectModularInteriorPowerTakeoff`) additionally strip Step 2 interior power rows and treat image-tail as authoritative. To qualify, the label must contain ASMEP, or `MODULAR_INTERIOR_POWER_TAKEOFF=1` must be set.
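Two of the rules above, the modular opt-in and the schedule cap/floor, can be sketched as pure functions. Both are illustrations under assumptions: `isModularInteriorPowerJob` and `applyScheduleBounds` are hypothetical names, the case-insensitive label match is a guess, and the real `detectModularInteriorPowerTakeoff` and schedule validation may behave differently.

```typescript
// Hypothetical sketch of the modular opt-in rule. The environment is passed in
// as a plain record so the function stays pure and easy to test.
function isModularInteriorPowerJob(
  label: string,
  env: Record<string, string | undefined>,
): boolean {
  if (env["MODULAR_INTERIOR_POWER_TAKEOFF"] === "1") return true;
  // Assumption: the label check is case-insensitive.
  return label.toUpperCase().indexOf("ASMEP") !== -1;
}

// Hypothetical sketch of schedule cap/floor: raise an under-count to a trusted
// floor and lower an over-count to a trusted cap. Either bound may be absent.
function applyScheduleBounds(count: number, floor?: number, cap?: number): number {
  let bounded = count;
  if (floor !== undefined) bounded = Math.max(bounded, floor);
  if (cap !== undefined) bounded = Math.min(bounded, cap);
  return bounded;
}
```

Note that neither function changes which pipeline stages run; in this sketch they only adjust interpretation of the counts after the stages have produced them.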
7. Related code
- `lib/folder-workflow/*`: ingest, classify (including the layered non-plan detector in `classify-folder-pdf.ts`), plan, execute, project intelligence (`project-intelligence.ts`), and post-folder cross-file intelligence (`cross-file-intelligence.ts` + `cross-file-intelligence-schema.ts`).
- `lib/llm/document-text-harvest.ts`: per-document verbatim "As Printed on the Sheets" harvest pass.
- `lib/reports/narrative-report.ts` + `markdown-to-docx.ts` + `markdown-to-pdf.ts`: multi-discipline narrative report (Markdown / DOCX / PDF) with bid-bucket reconciliation.
- `app/api/projects/[projectId]/folder-runs/route.ts`: multipart + JSON create.
- `app/api/projects/[projectId]/report/export/route.ts`: multi-discipline narrative report export endpoint.
- `app/api/projects/[projectId]/symbol-takeoff/route.ts`: symbols-only takeoff create + list.
- `app/api/projects/[projectId]/symbol-takeoff/[takeoffId]/route.ts`: takeoff detail (full `result` JSON for review UI).
- `app/api/projects/[projectId]/symbol-takeoff/[takeoffId]/promote-to-estimate/route.ts`: promote to `Estimate`.
- `components/projects/symbol-takeoff-pdf-viewer.tsx`: in-app PDF + bbox overlay (pdf.js).