RogerHsu7 2026-01-30 11:55:33 +00:00 committed by GitHub
commit 8258fab447
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
5 changed files with 315 additions and 0 deletions

@ -0,0 +1,32 @@
Title: Gateway Web Chat: add image/file upload (attach, drag-drop, paste)
Summary
- Adds attachment support in Gateway web Chat: attach button, drag & drop, and paste-from-clipboard for images.
- Backend: multipart upload endpoint that stores files under the instance media dir and returns MEDIA:/ paths; message API extended to accept media[].
- Frontend: composer queue with file chips, upload progress; message renderer for thumbnails and file tiles.
Motivation
Users of the web console need to share screenshots/logs directly in the browser. Today there is no upload mechanism.
Implementation (draft)
- Backend
  - POST /api/media/upload (multipart, repeatable file field). Saves to ~/.clawdbot/media/inbound/<uuid>.<ext>, returns { files: [{ path: 'MEDIA:/...', mime, name, size }] }.
  - Extend message creation to accept media[] in addition to text.
- Frontend
  - Attach button + hidden input[type=file] multiple.
  - Drag & drop overlay and handlers on composer.
  - Paste handler captures image blobs from clipboard.
  - Attachment queue with progress; after upload, send text + media paths.
  - Message renderer displays thumbnails for images and links for other files.
Config & Limits
- Defaults: 25MB per file; up to 5 files per message. Configurable via gateway config.
Security
- Sanitized filenames; type/size validation; stored under instance media path; served via existing media route (auth/signed URLs as applicable).
Acceptance Criteria
- Drag/paste/attach works; messages show thumbnails or file tiles; assistant receives MEDIA:/path.
Notes
This PR ships as a draft to align on API shape and UI placement. Happy to adjust to the codebase conventions and routing.

@ -0,0 +1,89 @@
# Gateway Web Chat: Image/File Upload
Status: draft
Owner: @assistant
Scope: Gateway web console (Chat)
## Problem
The Gateway web Chat has no way to send images/files. Users often need to share screenshots/logs during troubleshooting.
## Goals
- Allow users to attach files via: attach button, drag & drop, paste-from-clipboard (images).
- Show selected files as removable chips with upload progress.
- On send, upload files to server and insert a message with media references so assistants receive MEDIA:/path.
- Support multiple files per message.
- Reasonable default limits (20-50 MB per item), overridable via config.
## Non-goals
- Inline buttons/interactive UI in messages (not supported on webchat).
- Server-side virus scanning.
## UX
- Add "Attach" icon next to the composer.
- Dragging files over the composer shows a drop overlay.
- Pasting an image inserts it as a pending attachment.
- Each selected file shows: name/size, thumbnail for images, remove (×), progress bar while uploading.
- Pressing Enter (without Shift) sends text + queued attachments.
## API
- POST /api/media/upload (multipart/form-data)
  - fields: file (repeatable), caption (optional, per-file or per-message caption TBD)
  - returns 200 JSON: { files: [ { path: "MEDIA:/path", mime: "image/png", name: "...", size: 12345, width, height } ] }
  - saves files under gateway media path (e.g., ~/.clawdbot/media/inbound/<uuid>.<ext>)
- POST /api/sessions/:sessionId/messages
  - body: { text?: string, media?: [ { path: "MEDIA:/...", caption?: string } ] }
  - server stores transcript and delivers to assistant runtime.
Notes:
- If a single endpoint already exists to send messages, extend it to accept media[]. Otherwise, use two-phase (upload -> send).
- Respect existing auth/session; CSRF as per current gateway conventions.
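
The payload shapes above could be written down as types plus a small validator. This is a minimal sketch, not the gateway's actual schema: the type names, `buildSendBody`, `validateSendMessageBody`, and the `maxFiles` parameter are illustrative assumptions; only the field names come from the draft.

```typescript
// Hypothetical payload types for the two-phase flow (upload, then send).
// Field names follow the draft above; the type names are illustrative.
interface UploadedFile {
  path: string; // e.g. "MEDIA:/..."
  mime: string;
  name: string;
  size: number; // bytes
  width?: number; // images only, when the server could read dimensions
  height?: number;
}

interface UploadResponse {
  files: UploadedFile[];
}

interface MediaRef {
  path: string;
  caption?: string;
}

interface SendMessageBody {
  text?: string;
  media?: MediaRef[];
}

// Phase two: turn an upload response into the body for the send-message call.
function buildSendBody(text: string, upload: UploadResponse): SendMessageBody {
  const trimmed = text.trim();
  const media: MediaRef[] = upload.files.map((f) => ({ path: f.path }));
  return {
    ...(trimmed ? { text: trimmed } : {}),
    ...(media.length ? { media } : {}),
  };
}

// Returns a list of problems; an empty list means the body is acceptable.
function validateSendMessageBody(body: unknown, maxFiles = 5): string[] {
  if (typeof body !== "object" || body === null) return ["body must be an object"];
  const errors: string[] = [];
  const { text, media } = body as { text?: unknown; media?: unknown };

  if (text !== undefined && typeof text !== "string") errors.push("text must be a string");

  if (media !== undefined) {
    if (!Array.isArray(media)) {
      errors.push("media must be an array");
    } else {
      if (media.length > maxFiles) errors.push(`at most ${maxFiles} media items allowed`);
      media.forEach((m, i) => {
        const p = (m as { path?: unknown } | null)?.path;
        if (typeof p !== "string" || !p.startsWith("MEDIA:/")) {
          errors.push(`media[${i}].path must start with "MEDIA:/"`);
        }
      });
    }
  }

  const hasText = typeof text === "string" && text.trim().length > 0;
  const hasMedia = Array.isArray(media) && media.length > 0;
  if (!hasText && !hasMedia) errors.push("message needs text or at least one media item");
  return errors;
}
```

Running the same validator on the server keeps the message endpoint from accepting media references that did not come from the upload endpoint's `MEDIA:/` namespace.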
## Backend
- Add multipart handling (e.g., busboy/multer/fastify-multipart depending on stack).
- Validate content type and size; enforce configurable limits.
- Compute a safe filename; store the file in the media dir; return its absolute server path with the MEDIA: prefix (MEDIA:/...) for the assistant.
- For images: attempt to read dimensions (optional) to improve thumbnails.
- Return JSON list of stored files.
- Extend message creation to accept media[]. Persist media metadata alongside messages for rendering.
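
The validation and safe-filename steps above can be sketched as pure helpers. The function names (`isAllowedMime`, `storedFileName`, `displayName`) and the 8-character extension cap are assumptions made for illustration, not existing gateway utilities.

```typescript
// Matches a MIME type against an allow-list. Entries ending in "/*" match a
// whole family (e.g. "image/*"); all other entries must match exactly.
function isAllowedMime(mime: string, accept: string[]): boolean {
  const normalized = mime.trim().toLowerCase();
  return accept.some((entry) =>
    entry.endsWith("/*")
      ? normalized.startsWith(entry.slice(0, -1))
      : normalized === entry
  );
}

// Builds the on-disk name from a server-generated id plus a sanitized
// extension. The client-supplied name is never used as a path component.
function storedFileName(id: string, clientName: string | undefined): string {
  const name = clientName ?? "";
  const dot = name.lastIndexOf(".");
  const rawExt = dot > 0 ? name.slice(dot + 1).toLowerCase() : "";
  // Only short alphanumeric extensions survive; anything else is dropped.
  const ext = /^[a-z0-9]{1,8}$/.test(rawExt) ? `.${rawExt}` : "";
  return `${id}${ext}`;
}

// Cleans a client filename for display only (transcripts, download names):
// keeps the last path segment and strips control characters.
function displayName(clientName: string): string {
  const base = clientName.split(/[\\/]/).pop() ?? "";
  const cleaned = base.replace(/[\u0000-\u001f\u007f]/g, "").trim();
  return cleaned.length > 0 ? cleaned.slice(0, 255) : "file";
}
```

Keeping the stored name (`<uuid>.<ext>`) separate from the display name means a hostile filename can affect only what is shown in the transcript, never where the file lands on disk.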
## Frontend
- Composer component changes:
  - Add hidden <input type="file" multiple> bound to Attach button.
  - Drag & drop overlay and handlers (dragenter/dragover/drop) on composer area.
  - Clipboard paste handler: capture image blobs and add to attachment queue.
  - Attachment queue state: { id, file|blob, name, size, previewURL, status: queued|uploading|done|error, serverPath? }[]
  - On send: for any queued files not uploaded, call /api/media/upload, show per-file progress. Then call send-message with text + media paths.
  - After success: clear input and queue.
- Message renderer: if message.media exists, render thumbnails for images and file tiles for others; clicking downloads the file.
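
One way to keep the queue transitions (queued, uploading, done, error) predictable is a pure reducer that the composer drives, for example through React's `useReducer`. This is a sketch under assumptions: the action names, `queueReducer`, and `mediaForSend` are hypothetical, and the file/blob and preview fields are omitted to keep it framework-free.

```typescript
type Status = "queued" | "uploading" | "done" | "error";

interface Attachment {
  id: string;
  name: string;
  size: number;
  status: Status;
  serverPath?: string;
  error?: string;
}

type QueueAction =
  | { type: "add"; items: Array<{ id: string; name: string; size: number }> }
  | { type: "remove"; id: string }
  | { type: "uploadStarted"; ids: string[] }
  | { type: "uploadSucceeded"; paths: Record<string, string> } // id -> MEDIA:/ path
  | { type: "uploadFailed"; ids: string[]; error: string }
  | { type: "clear" };

// Pure state transition: never mutates the input queue.
function queueReducer(queue: Attachment[], action: QueueAction): Attachment[] {
  switch (action.type) {
    case "add":
      return [...queue, ...action.items.map((i) => ({ ...i, status: "queued" as const }))];
    case "remove":
      return queue.filter((a) => a.id !== action.id);
    case "uploadStarted":
      return queue.map((a) =>
        action.ids.includes(a.id) ? { ...a, status: "uploading" as const, error: undefined } : a
      );
    case "uploadSucceeded":
      return queue.map((a) =>
        Object.prototype.hasOwnProperty.call(action.paths, a.id)
          ? { ...a, status: "done" as const, serverPath: action.paths[a.id] }
          : a
      );
    case "uploadFailed":
      return queue.map((a) =>
        action.ids.includes(a.id) ? { ...a, status: "error" as const, error: action.error } : a
      );
    case "clear":
      return [];
  }
}

// Media references to include in the send-message call: uploaded items only.
function mediaForSend(queue: Attachment[]): { path: string }[] {
  return queue
    .filter((a) => a.status === "done" && typeof a.serverPath === "string")
    .map((a) => ({ path: a.serverPath as string }));
}
```

Because the send step reads paths from the reducer's result rather than from a captured state variable, it cannot pick up a stale queue that predates the upload.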
## Config
- gateway.webchat.uploads.maxItemMB (default 25)
- gateway.webchat.uploads.maxFilesPerMessage (default 5)
- gateway.webchat.uploads.accept (default images+generic: "image/*,application/pdf,.zip,.txt")
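
A possible shape for these settings, with defaults merged over overrides and a pre-upload limit check, is sketched below. The key names and default values come from the list above; `UploadConfig`, `resolveUploadConfig`, and `checkLimits` are hypothetical names, and how the gateway actually loads config is not assumed.

```typescript
interface UploadConfig {
  maxItemMB: number;
  maxFilesPerMessage: number;
  accept: string; // comma-separated MIME types and extensions
}

const DEFAULT_UPLOAD_CONFIG: UploadConfig = {
  maxItemMB: 25,
  maxFilesPerMessage: 5,
  accept: "image/*,application/pdf,.zip,.txt",
};

// Merges overrides over the defaults and rejects nonsensical limits early.
function resolveUploadConfig(overrides: Partial<UploadConfig> = {}): UploadConfig {
  const merged = { ...DEFAULT_UPLOAD_CONFIG, ...overrides };
  if (!(merged.maxItemMB > 0)) {
    throw new RangeError("maxItemMB must be a positive number");
  }
  if (!Number.isInteger(merged.maxFilesPerMessage) || merged.maxFilesPerMessage < 1) {
    throw new RangeError("maxFilesPerMessage must be a positive integer");
  }
  return merged;
}

interface Candidate {
  name: string;
  size: number; // bytes
}

// Returns human-readable rejection reasons; empty means the selection is fine.
function checkLimits(files: Candidate[], cfg: UploadConfig): string[] {
  const errors: string[] = [];
  const maxBytes = cfg.maxItemMB * 1024 * 1024;
  if (files.length > cfg.maxFilesPerMessage) {
    errors.push(`Too many files: ${files.length} selected, limit is ${cfg.maxFilesPerMessage}.`);
  }
  for (const f of files) {
    if (f.size > maxBytes) {
      const mb = (f.size / 1024 / 1024).toFixed(1);
      errors.push(`"${f.name}" is ${mb} MB; the limit is ${cfg.maxItemMB} MB.`);
    }
  }
  return errors;
}
```

The same check can run in the composer before upload (for a fast, clear error) and again on the server, which remains the authority on limits.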
## Security
- Store under per-instance media directory; do not serve arbitrary FS paths.
- Serve media via controlled route with auth (or signed URLs) to avoid leaking private files.
- Sanitize filenames; block executables by default if desired.
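
To illustrate the "do not serve arbitrary FS paths" rule, the media route could map a `MEDIA:/...` reference to a path relative to the media root and refuse everything else. This is a simplified sketch assuming POSIX-style paths; `relativeMediaPath` is a hypothetical helper, and a real implementation would likely also resolve symlinks on disk before serving.

```typescript
// Maps a "MEDIA:/..." reference to a path relative to the media root, or
// returns null when the reference points anywhere else.
function relativeMediaPath(ref: string, mediaRoot: string): string | null {
  if (!ref.startsWith("MEDIA:/")) return null;
  const abs = ref.slice("MEDIA:".length);

  // Compare against the root with a trailing slash so that "/srv/media-other"
  // is not mistaken for a child of "/srv/media".
  const root = mediaRoot.endsWith("/") ? mediaRoot : mediaRoot + "/";
  if (!abs.startsWith(root)) return null;

  const rel = abs.slice(root.length);
  const bad = rel
    .split("/")
    .some((s) => s === "" || s === "." || s === ".." || s.includes("\0") || s.includes("\\"));
  return bad ? null : rel;
}
```

For example, with a media root of `/srv/media`, the reference `MEDIA:/srv/media/inbound/a.png` maps to `inbound/a.png`, while `MEDIA:/srv/media/../etc/passwd` and `MEDIA:/etc/passwd` are both rejected.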
## Acceptance Criteria
- Drag a screenshot into composer -> shows chip -> send -> message appears with thumbnail; assistant receives MEDIA:/path.
- Click attach to select multiple files -> progress -> all appear in transcript.
- Paste an image from clipboard -> becomes attachment and can be sent.
- Oversize file is rejected with a clear error.
## Test Plan
- Unit: multipart handler, limit enforcement, message payload schema.
- E2E: browser test for drag/drop, paste, attach; verify message persistence and media retrieval.
## Open Questions
- Do we want per-file captions? For MVP we can skip.
- Serving media: public path vs. signed URL behind auth — follow current Gateway approach.
## Task Breakdown
- [ ] Backend: media upload endpoint + storage
- [ ] Backend: extend message API to accept media[]
- [ ] Frontend: composer (attach/drag/paste) + queue + progress
- [ ] Frontend: message renderer for media
- [ ] Config + docs
- [ ] Tests

@ -0,0 +1,69 @@
// Draft implementation for Gateway media upload (Fastify-style)
// Adjust to the actual server framework (Fastify/Express) used by clawdbot gateway.
import path from 'node:path'
import fs from 'node:fs/promises'
import { randomUUID } from 'node:crypto'

// Pseudo helpers: replace with the gateway's real config/log/auth utilities
const CONFIG = {
  uploads: {
    maxItemBytes: 25 * 1024 * 1024,
    maxFilesPerMessage: 5,
    // Entries ending in '/*' match a whole MIME family; all others must match exactly.
    accept: ['image/*', 'application/pdf', 'text/plain', 'application/zip'],
  },
  mediaDir: path.resolve(
    process.env.CLAWDBOT_MEDIA_DIR ||
      path.join(process.env.HOME || process.cwd(), '.clawdbot', 'media', 'inbound'),
  ),
}

function ensureDir(p: string) {
  return fs.mkdir(p, { recursive: true })
}

// Prefer a sanitized extension from the client filename; otherwise guess from the MIME type.
function inferExt(mime: string, name?: string) {
  const fromName = name ? path.extname(name).toLowerCase() : ''
  if (/^\.[a-z0-9]{1,8}$/.test(fromName)) return fromName
  if (mime.startsWith('image/')) return '.png' // fallback guess for unnamed images
  if (mime === 'application/pdf') return '.pdf'
  return ''
}

// Note: mimetype is declared by the client; sniff file content if stricter validation is needed.
function allowed(mime: string) {
  return CONFIG.uploads.accept.some(a => a.endsWith('/*') ? mime.startsWith(a.slice(0, -1)) : mime === a)
}

export async function registerMediaUploadRoute(fastify: any) {
  await ensureDir(CONFIG.mediaDir)
  fastify.register(import('@fastify/multipart'), {
    limits: {
      fileSize: CONFIG.uploads.maxItemBytes,
      files: CONFIG.uploads.maxFilesPerMessage,
    },
  })

  fastify.post('/api/media/upload', async (req: any, reply: any) => {
    const files: any[] = []
    for await (const part of req.parts()) {
      if (part.type !== 'file') continue
      const { filename, mimetype } = part
      if (!allowed(mimetype)) {
        part.file.resume() // drain the rejected stream so the request can finish
        return reply.code(415).send({ error: 'unsupported_type', mimetype })
      }
      const id = randomUUID()
      const ext = inferExt(mimetype, filename)
      const outPath = path.join(CONFIG.mediaDir, `${id}${ext}`)
      const chunks: Buffer[] = []
      let size = 0
      for await (const chunk of part.file) {
        size += chunk.length
        chunks.push(chunk)
      }
      // With limits.fileSize set, the multipart parser may cut the stream at the limit and
      // flag it as truncated instead of delivering extra bytes, so check the flag as well
      // as the byte count before anything is written to disk.
      if (part.file.truncated || size > CONFIG.uploads.maxItemBytes) {
        return reply.code(413).send({ error: 'file_too_large' })
      }
      await fs.writeFile(outPath, Buffer.concat(chunks))
      // outPath is absolute, so this yields MEDIA:/absolute/path
      const mediaPath = `MEDIA:${outPath}`
      files.push({ path: mediaPath, mime: mimetype, name: filename, size })
    }
    return reply.send({ files })
  })
}

@ -0,0 +1,100 @@
// Draft React snippet: add attach button + drag & drop + paste image support
// Integrate into the existing web console front-end (adjust imports/styles/state mgmt).
import React, { useCallback, useEffect, useRef, useState } from 'react'

type QueueItem = {
  id: string
  file: File
  previewURL?: string
  status: 'queued' | 'uploading' | 'done' | 'error'
  serverPath?: string
  error?: string
}

// Wrap picked/dropped/pasted files as queue entries; images get a local preview URL.
// Note: crypto.randomUUID() is only available in secure contexts (HTTPS or localhost).
function toQueueItems(files: File[]): QueueItem[] {
  return files.map(f => ({
    id: crypto.randomUUID(),
    file: f,
    previewURL: f.type.startsWith('image/') ? URL.createObjectURL(f) : undefined,
    status: 'queued' as const,
  }))
}

// Upload all items in one multipart request.
// Assumes the server returns files in the same order they were appended.
async function uploadFiles(items: QueueItem[]): Promise<QueueItem[]> {
  const form = new FormData()
  items.forEach(i => form.append('file', i.file, i.file.name))
  const res = await fetch('/api/media/upload', { method: 'POST', body: form })
  if (!res.ok) throw new Error(`upload failed: ${res.status}`)
  const json = await res.json()
  return items.map((it, idx) => ({ ...it, status: 'done' as const, serverPath: json.files[idx]?.path }))
}

export function ChatComposerAttachment({ onSend }: { onSend: (payload: { text?: string, media?: { path: string }[] }) => Promise<void> }) {
  const [text, setText] = useState('')
  const [queue, setQueue] = useState<QueueItem[]>([])
  const inputRef = useRef<HTMLInputElement>(null)

  const enqueue = useCallback((files: File[]) => {
    if (files.length) setQueue(q => [...q, ...toQueueItems(files)])
  }, [])

  const pickFiles = () => inputRef.current?.click()

  const onFileChange = (e: React.ChangeEvent<HTMLInputElement>) => {
    enqueue(Array.from(e.target.files || []))
    e.target.value = '' // allow selecting the same file again later
  }

  const onDrop = (e: React.DragEvent) => {
    e.preventDefault()
    enqueue(Array.from(e.dataTransfer.files || []))
  }

  const onPaste = useCallback((e: ClipboardEvent) => {
    const items = Array.from(e.clipboardData?.items || [])
    const files = items.filter(i => i.kind === 'file').map(i => i.getAsFile()).filter(Boolean) as File[]
    if (files.length) {
      e.preventDefault()
      enqueue(files)
    }
  }, [enqueue])

  useEffect(() => {
    // Listens on window, so a paste anywhere on the page attaches to this composer.
    window.addEventListener('paste', onPaste)
    return () => window.removeEventListener('paste', onPaste)
  }, [onPaste])

  const removeItem = (item: QueueItem) => {
    if (item.previewURL) URL.revokeObjectURL(item.previewURL)
    setQueue(q => q.filter(i => i.id !== item.id))
  }

  const send = async () => {
    if (!text.trim() && queue.length === 0) return
    // Upload anything not yet on the server; items that failed earlier are retried.
    const pending = queue.filter(q => q.status === 'queued' || q.status === 'error')
    let current = queue
    if (pending.length) {
      const pendingIds = new Set(pending.map(p => p.id))
      setQueue(q => q.map(i => pendingIds.has(i.id) ? { ...i, status: 'uploading' as const, error: undefined } : i))
      try {
        const uploaded = await uploadFiles(pending)
        const byId = new Map(uploaded.map(u => [u.id, u] as const))
        current = queue.map(i => byId.get(i.id) || i)
        setQueue(q => q.map(i => byId.get(i.id) || i))
      } catch (e: any) {
        setQueue(q => q.map(i => pendingIds.has(i.id) ? { ...i, status: 'error' as const, error: String(e?.message || e) } : i))
        return
      }
    }
    // Build media from the fresh upload results; `queue` is stale inside this closure.
    const media = current.map(i => i.serverPath).filter(Boolean).map(p => ({ path: p! }))
    await onSend({ text: text.trim() || undefined, media: media.length ? media : undefined })
    current.forEach(i => { if (i.previewURL) URL.revokeObjectURL(i.previewURL) })
    setText('')
    setQueue([])
  }

  return (
    <div className="composer" onDragOver={e => e.preventDefault()} onDrop={onDrop}>
      <input ref={inputRef} type="file" multiple hidden onChange={onFileChange} />
      <button type="button" onClick={pickFiles} aria-label="Attach">📎</button>
      <input
        value={text}
        onChange={e => setText(e.target.value)}
        placeholder="Message (paste images or drag files here)"
        onKeyDown={e => { if (e.key === 'Enter' && !e.shiftKey) { e.preventDefault(); send() } }}
      />
      <button type="button" onClick={send}>Send</button>
      {queue.length > 0 && (
        <div className="attachments">
          {queue.map(it => (
            <div key={it.id} className={`chip ${it.status}`}>
              {it.previewURL ? <img src={it.previewURL} alt={it.file.name} /> : <span className="file-icon" />}
              <span className="name">{it.file.name}</span>
              <span className="size">{(it.file.size / 1024).toFixed(1)} KB</span>
              {it.status === 'uploading' && <span className="progress" />}
              {it.status === 'error' && <span className="error">{it.error}</span>}
              <button type="button" onClick={() => removeItem(it)} aria-label={`Remove ${it.file.name}`}>×</button>
            </div>
          ))}
        </div>
      )}
    </div>
  )
}

@ -0,0 +1,25 @@
import React from 'react'

type Media = { path: string, mime?: string, name?: string }

// Treat as an image if the MIME type says so, or fall back to the file extension.
function isImage(mime?: string, path?: string): boolean {
  return Boolean(mime?.startsWith('image/')) || /\.(png|jpe?g|webp|gif)$/i.test(path || '')
}

// MEDIA:/absolute/path should be proxied/served by the gateway; adjust URL transformation accordingly.
// The slash after "MEDIA:" is consumed so the result is /media/... rather than /media//...
function mediaUrl(path: string) {
  return path.replace(/^MEDIA:\/?/, '/media/')
}

export function MessageMedia({ media }: { media: Media[] }) {
  if (!media?.length) return null
  return (
    <div className="message-media">
      {media.map((m, i) => (
        <div key={`${m.path}-${i}`} className="media-item">
          {isImage(m.mime, m.path) ? (
            <img src={mediaUrl(m.path)} alt={m.name || 'image'} />
          ) : (
            <a href={mediaUrl(m.path)} target="_blank" rel="noreferrer">{m.name || 'file'}</a>
          )}
        </div>
      ))}
    </div>
  )
}