Merge remote-tracking branch 'origin/main' into upstream-preview-nix-2025-12-20

This commit is contained in:
Peter Steinberger 2026-01-01 09:15:28 +01:00
commit ad9a9d8d35
163 changed files with 10867 additions and 1712 deletions

.gitignore vendored — 1 change

@ -8,6 +8,7 @@ coverage
.worktrees/
.DS_Store
**/.DS_Store
ui/src/ui/__screenshots__/
# Bun build artifacts
*.bun-build


@ -23,6 +23,7 @@
- Naming: match source names with `*.test.ts`; e2e in `*.e2e.test.ts`.
- Run `pnpm test` (or `pnpm test:coverage`) before pushing when you touch logic.
- Pure test additions/fixes generally do **not** need a changelog entry unless they alter user-facing behavior or the user asks for one.
- Mobile: before using a simulator, check for connected real devices (iOS + Android) and prefer them when available.
## Commit & Pull Request Guidelines
- Create commits with `scripts/committer "<msg>" <file...>`; avoid manual `git add`/`git commit` so staging stays scoped.
@ -41,6 +42,9 @@
- Also read the shared guardrails at `~/Projects/oracle/AGENTS.md` and `~/Projects/agent-scripts/AGENTS.MD` before making changes; align with any cross-repo rules noted there.
- SwiftUI state management (iOS/macOS): prefer the `Observation` framework (`@Observable`, `@Bindable`) over `ObservableObject`/`@StateObject`; don't introduce new `ObservableObject` unless required for compatibility, and migrate existing usages when touching related code.
- **Restart apps:** “restart iOS/Android apps” means rebuild (recompile/install) and relaunch, not just kill/launch.
- **Device checks:** before testing, verify connected real devices (iOS/Android) before reaching for simulators/emulators.
- iOS Team ID lookup: `security find-identity -p codesigning -v` → use Apple Development (…) TEAMID. Fallback: `defaults read com.apple.dt.Xcode IDEProvisioningTeamIdentifiers`.
- A2UI bundle hash: `src/canvas-host/a2ui/.bundle.hash` is auto-generated; regenerate via `pnpm canvas:a2ui:bundle` (or `scripts/bundle-a2ui.sh`) instead of manual conflict resolution.
- Notary key file lives at `~/Library/CloudStorage/Dropbox/Backup/AppStore/AuthKey_NJF3NFGTS3.p8` (Sparkle keys live under `~/Library/CloudStorage/Dropbox/Backup/Sparkle`).
- **Multi-agent safety:** do **not** create/apply/drop `git stash` entries unless Peter explicitly asks (this includes `git pull --rebase --autostash`). Assume other agents may be working; keep unrelated WIP untouched and avoid cross-cutting state changes.
- **Multi-agent safety:** when Peter says "push", you may `git pull --rebase` to integrate latest changes (never discard other agents' work). When Peter says "commit", scope to your changes only. When Peter says "commit all", commit everything in grouped chunks.


@ -2,8 +2,66 @@
## 2.0.0-beta5 — Unreleased
### Features
- Talk mode: continuous speech conversations (macOS/iOS/Android) with ElevenLabs TTS, reply directives, and optional interrupt-on-speech.
- UI: add optional `ui.seamColor` accent to tint the Talk Mode side bubble (macOS/iOS/Android).
- Agent runtime: accept legacy `Z_AI_API_KEY` for Z.AI provider auth (maps to `ZAI_API_KEY`).
- Tests: add a Z.AI live test gate for smoke validation when keys are present.
- macOS Debug: add app log verbosity and rolling file log toggle for swift-log-backed app logs.
### Fixes
- Docs/agent tools: clarify that browser `wait` should be avoided by default and used only in exceptional cases.
- macOS: Voice Wake now fully tears down the Speech pipeline when disabled (cancel pending restarts, drop stale callbacks) to avoid high CPU in the background.
- macOS menu: add a Talk Mode action alongside the Open Dashboard/Chat/Canvas entries.
- macOS Debug: hide “Restart Gateway” when the app won't start a local gateway (remote mode / attach-only).
- macOS Talk Mode: orb overlay refresh, ElevenLabs request logging, API key status in settings, and auto-select first voice when none is configured.
- macOS Talk Mode: add hard timeout around ElevenLabs TTS synthesis to avoid getting stuck “speaking” forever on hung requests.
- macOS Talk Mode: avoid stuck playback when the audio player never starts (fail-fast + watchdog).
- macOS Talk Mode: fix audio stop ordering so disabling Talk Mode always stops in-flight playback.
- macOS Talk Mode: throttle audio-level updates (avoid per-buffer task creation) to reduce CPU/task churn.
- macOS Talk Mode: increase overlay window size so wave rings don't clip; close button is hover-only and closer to the orb.
- Talk Mode: fall back to system TTS when ElevenLabs is unavailable, returns non-audio, or playback fails (macOS/iOS/Android).
- Talk Mode: stream PCM on macOS/iOS for lower latency (incremental playback); Android continues MP3 streaming.
- Talk Mode: validate ElevenLabs v3 stability and latency tier directives before sending requests.
- iOS/Android Talk Mode: auto-select the first ElevenLabs voice when none is configured.
- ElevenLabs: add retry/backoff for 429/5xx and include content-type in errors for debugging.
- Talk Mode: align to the gateway's main session key and fall back to history polling when chat events drop (prevents stuck “thinking” / missing messages).
- Talk Mode: treat history timestamps as seconds or milliseconds to avoid stale assistant picks (macOS/iOS/Android).
- Chat UI: clear streaming/tool bubbles when external runs finish, preventing duplicate assistant bubbles.
- Chat UI: user bubbles use `ui.seamColor` (fallback to a calmer default blue).
- Android Chat UI: use `onPrimary` for user bubble text to preserve contrast (thanks @Syhids).
- Control UI: sync sidebar navigation with the URL for deep-linking, and auto-scroll chat to the latest message.
- Control UI: disable Web Chat + Talk when no iOS/Android node is connected; refreshed Web Chat styling and keyboard send.
- macOS: bundle Control UI assets into the app relay so the packaged app can serve them (thanks @mbelinky).
- Talk Mode: wait for chat history to surface the assistant reply before starting TTS (macOS/iOS/Android).
- iOS Talk Mode: fix chat completion wait to time out even if no events arrive (prevents “Thinking…” hangs).
- iOS Talk Mode: keep recognition running during playback to support interrupt-on-speech.
- iOS Talk Mode: preserve directive voice/model overrides across config reloads and add ElevenLabs request timeouts.
- iOS/Android Talk Mode: explicitly `chat.subscribe` when Talk Mode is active, so completion events arrive even if the Chat UI isn't open.
- Chat UI: refresh history when another client finishes a run in the same session, so Talk Mode + Voice Wake transcripts appear consistently.
- Gateway: `voice.transcript` now also maps agent bus output to `chat` events, ensuring chat UIs refresh for voice-triggered runs.
- iOS/Android: show a centered Talk Mode orb overlay while Talk Mode is enabled.
- Gateway config: inject `talk.apiKey` from `ELEVENLABS_API_KEY`/shell profile so nodes can fetch it on demand.
- Canvas A2UI: tag requests with `platform=android|ios|macos` and boost Android canvas background contrast.
- iOS/Android nodes: enable scrolling for loaded web pages in the Canvas WebView (default scaffold stays touch-first).
- macOS menu: device list now uses `node.list` (devices only; no agent/tool presence entries).
- macOS menu: device list now shows connected nodes only.
- macOS menu: device rows now pack platform/version on the first line, and command lists wrap in submenus.
- macOS menu: split device platform/version across first and second rows for better fit.
- iOS node: fix ReplayKit screen recording crash caused by queue isolation assertions during capture.
- iOS Talk Mode: avoid audio tap queue assertions when starting recognition.
- macOS: use $HOME/Library/pnpm for SSH PATH exports (thanks @mbelinky).
- iOS/Android nodes: bridge auto-connect refreshes stale tokens and settings now show richer bridge/device details.
- macOS: bundle device model resources to prevent Instances crashes (thanks @mbelinky).
- iOS/Android nodes: status pill now surfaces camera activity instead of overlay toasts.
- iOS/Android/macOS nodes: camera snaps recompress to keep base64 payloads under 5 MB.
- iOS/Android nodes: status pill now surfaces pairing, screen recording, voice wake, and foreground-required states.
- iOS/Android nodes: avoid duplicating “Gateway reconnecting…” when the bridge is already connecting.
- iOS/Android nodes: Talk Mode now lives on a side bubble (with an iOS toggle to hide it), and Android settings no longer show the Talk Mode switch.
- macOS menu: top status line now shows pending node pairing approvals (incl. repairs).
- CLI: avoid spurious gateway close errors after successful request/response cycles.
- Agent runtime: clamp tool-result images to the 5MB Anthropic limit to avoid hard request rejections.
- Tests: add Swift Testing coverage for camera errors and Kotest coverage for Android bridge endpoints.
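The seconds-vs-milliseconds timestamp fix above can be sketched as a small normalizer. This is a hypothetical illustration of the heuristic, not the shipped code: epoch seconds today are around 1.7e9 while epoch millis are around 1.7e12, so a magnitude threshold cleanly separates the two.

```kotlin
// Hypothetical sketch of the seconds-vs-milliseconds disambiguation; the real
// implementation may differ. Values below 1e12 are treated as epoch seconds.
fun normalizeToMillis(ts: Long): Long =
    if (ts < 1_000_000_000_000L) ts * 1000 else ts

fun main() {
    check(normalizeToMillis(1_735_689_600L) == 1_735_689_600_000L)     // seconds in → ms out
    check(normalizeToMillis(1_735_689_600_000L) == 1_735_689_600_000L) // ms in → unchanged
}
```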
## 2.0.0-beta4 — 2025-12-27


@ -19,6 +19,8 @@ It answers you on the surfaces you already use (WhatsApp, Telegram, Discord, Web
If you want a private, single-user assistant that feels local, fast, and always-on, this is it.
Using Claude Pro/Max subscription? See `docs/onboarding.md` for the Anthropic OAuth setup.
```
Your surfaces


@ -64,6 +64,7 @@ dependencies {
implementation("androidx.core:core-ktx:1.17.0")
implementation("androidx.lifecycle:lifecycle-runtime-ktx:2.10.0")
implementation("androidx.activity:activity-compose:1.12.2")
implementation("androidx.webkit:webkit:1.14.0")
implementation("androidx.compose.ui:ui")
implementation("androidx.compose.ui:ui-tooling-preview")
@ -93,4 +94,11 @@ dependencies {
testImplementation("junit:junit:4.13.2")
testImplementation("org.jetbrains.kotlinx:kotlinx-coroutines-test:1.10.2")
testImplementation("io.kotest:kotest-runner-junit5-jvm:6.0.7")
testImplementation("io.kotest:kotest-assertions-core-jvm:6.0.7")
testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.13.3")
}
tasks.withType<Test>().configureEach {
useJUnitPlatform()
}


@ -23,9 +23,12 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
val statusText: StateFlow<String> = runtime.statusText
val serverName: StateFlow<String?> = runtime.serverName
val remoteAddress: StateFlow<String?> = runtime.remoteAddress
val isForeground: StateFlow<Boolean> = runtime.isForeground
val seamColorArgb: StateFlow<Long> = runtime.seamColorArgb
val cameraHud: StateFlow<CameraHudState?> = runtime.cameraHud
val cameraFlashToken: StateFlow<Long> = runtime.cameraFlashToken
val screenRecordActive: StateFlow<Boolean> = runtime.screenRecordActive
val instanceId: StateFlow<String> = runtime.instanceId
val displayName: StateFlow<String> = runtime.displayName
@ -35,6 +38,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
val voiceWakeMode: StateFlow<VoiceWakeMode> = runtime.voiceWakeMode
val voiceWakeStatusText: StateFlow<String> = runtime.voiceWakeStatusText
val voiceWakeIsListening: StateFlow<Boolean> = runtime.voiceWakeIsListening
val talkEnabled: StateFlow<Boolean> = runtime.talkEnabled
val talkStatusText: StateFlow<String> = runtime.talkStatusText
val talkIsListening: StateFlow<Boolean> = runtime.talkIsListening
val talkIsSpeaking: StateFlow<Boolean> = runtime.talkIsSpeaking
val manualEnabled: StateFlow<Boolean> = runtime.manualEnabled
val manualHost: StateFlow<String> = runtime.manualHost
val manualPort: StateFlow<Int> = runtime.manualPort
@ -95,6 +102,10 @@ class MainViewModel(app: Application) : AndroidViewModel(app) {
runtime.setVoiceWakeMode(mode)
}
fun setTalkEnabled(enabled: Boolean) {
runtime.setTalkEnabled(enabled)
}
fun connect(endpoint: BridgeEndpoint) {
runtime.connect(endpoint)
}


@ -25,6 +25,7 @@ import com.steipete.clawdis.node.protocol.ClawdisCanvasA2UIAction
import com.steipete.clawdis.node.protocol.ClawdisCanvasA2UICommand
import com.steipete.clawdis.node.protocol.ClawdisCanvasCommand
import com.steipete.clawdis.node.protocol.ClawdisScreenCommand
import com.steipete.clawdis.node.voice.TalkModeManager
import com.steipete.clawdis.node.voice.VoiceWakeManager
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
@ -69,7 +70,7 @@ class NodeRuntime(context: Context) {
payloadJson =
buildJsonObject {
put("message", JsonPrimitive(command))
put("sessionKey", JsonPrimitive("main"))
put("sessionKey", JsonPrimitive(mainSessionKey.value))
put("thinking", JsonPrimitive(chatThinkingLevel.value))
put("deliver", JsonPrimitive(false))
}.toString(),
@ -84,6 +85,15 @@ class NodeRuntime(context: Context) {
val voiceWakeStatusText: StateFlow<String>
get() = voiceWake.statusText
val talkStatusText: StateFlow<String>
get() = talkMode.statusText
val talkIsListening: StateFlow<Boolean>
get() = talkMode.isListening
val talkIsSpeaking: StateFlow<Boolean>
get() = talkMode.isSpeaking
private val discovery = BridgeDiscovery(appContext, scope = scope)
val bridges: StateFlow<List<BridgeEndpoint>> = discovery.bridges
val discoveryStatusText: StateFlow<String> = discovery.statusText
@ -94,6 +104,9 @@ class NodeRuntime(context: Context) {
private val _statusText = MutableStateFlow("Offline")
val statusText: StateFlow<String> = _statusText.asStateFlow()
private val _mainSessionKey = MutableStateFlow("main")
val mainSessionKey: StateFlow<String> = _mainSessionKey.asStateFlow()
private val cameraHudSeq = AtomicLong(0)
private val _cameraHud = MutableStateFlow<CameraHudState?>(null)
val cameraHud: StateFlow<CameraHudState?> = _cameraHud.asStateFlow()
@ -101,12 +114,18 @@ class NodeRuntime(context: Context) {
private val _cameraFlashToken = MutableStateFlow(0L)
val cameraFlashToken: StateFlow<Long> = _cameraFlashToken.asStateFlow()
private val _screenRecordActive = MutableStateFlow(false)
val screenRecordActive: StateFlow<Boolean> = _screenRecordActive.asStateFlow()
private val _serverName = MutableStateFlow<String?>(null)
val serverName: StateFlow<String?> = _serverName.asStateFlow()
private val _remoteAddress = MutableStateFlow<String?>(null)
val remoteAddress: StateFlow<String?> = _remoteAddress.asStateFlow()
private val _seamColorArgb = MutableStateFlow(DEFAULT_SEAM_COLOR_ARGB)
val seamColorArgb: StateFlow<Long> = _seamColorArgb.asStateFlow()
private val _isForeground = MutableStateFlow(true)
val isForeground: StateFlow<Boolean> = _isForeground.asStateFlow()
@ -120,6 +139,8 @@ class NodeRuntime(context: Context) {
_serverName.value = name
_remoteAddress.value = remote
_isConnected.value = true
_seamColorArgb.value = DEFAULT_SEAM_COLOR_ARGB
scope.launch { refreshBrandingFromGateway() }
scope.launch { refreshWakeWordsFromGateway() }
maybeNavigateToA2uiOnConnect()
},
@ -133,12 +154,17 @@ class NodeRuntime(context: Context) {
)
private val chat = ChatController(scope = scope, session = session, json = json)
private val talkMode: TalkModeManager by lazy {
TalkModeManager(context = appContext, scope = scope).also { it.attachSession(session) }
}
private fun handleSessionDisconnected(message: String) {
_statusText.value = message
_serverName.value = null
_remoteAddress.value = null
_isConnected.value = false
_seamColorArgb.value = DEFAULT_SEAM_COLOR_ARGB
_mainSessionKey.value = "main"
chat.onDisconnected(message)
showLocalCanvasOnDisconnect()
}
@ -163,6 +189,7 @@ class NodeRuntime(context: Context) {
val preventSleep: StateFlow<Boolean> = prefs.preventSleep
val wakeWords: StateFlow<List<String>> = prefs.wakeWords
val voiceWakeMode: StateFlow<VoiceWakeMode> = prefs.voiceWakeMode
val talkEnabled: StateFlow<Boolean> = prefs.talkEnabled
val manualEnabled: StateFlow<Boolean> = prefs.manualEnabled
val manualHost: StateFlow<String> = prefs.manualHost
val manualPort: StateFlow<Int> = prefs.manualPort
@ -218,6 +245,13 @@ class NodeRuntime(context: Context) {
}
}
scope.launch {
talkEnabled.collect { enabled ->
talkMode.setEnabled(enabled)
externalAudioCaptureActive.value = enabled
}
}
scope.launch(Dispatchers.Default) {
bridges.collect { list ->
if (list.isNotEmpty()) {
@ -311,6 +345,10 @@ class NodeRuntime(context: Context) {
prefs.setVoiceWakeMode(mode)
}
fun setTalkEnabled(value: Boolean) {
prefs.setTalkEnabled(value)
}
fun connect(endpoint: BridgeEndpoint) {
scope.launch {
_statusText.value = "Connecting…"
@ -548,6 +586,7 @@ class NodeRuntime(context: Context) {
return
}
talkMode.handleBridgeEvent(event, payloadJson)
chat.handleBridgeEvent(event, payloadJson)
}
@ -589,6 +628,25 @@ class NodeRuntime(context: Context) {
}
}
private suspend fun refreshBrandingFromGateway() {
if (!_isConnected.value) return
try {
val res = session.request("config.get", "{}")
val root = json.parseToJsonElement(res).asObjectOrNull()
val config = root?.get("config").asObjectOrNull()
val ui = config?.get("ui").asObjectOrNull()
val raw = ui?.get("seamColor").asStringOrNull()?.trim()
val sessionCfg = config?.get("session").asObjectOrNull()
val rawMainKey = sessionCfg?.get("mainKey").asStringOrNull()?.trim()
_mainSessionKey.value = rawMainKey?.takeIf { it.isNotEmpty() } ?: "main"
val parsed = parseHexColorArgb(raw)
_seamColorArgb.value = parsed ?: DEFAULT_SEAM_COLOR_ARGB
} catch (_: Throwable) {
// ignore
}
}
private suspend fun handleInvoke(command: String, paramsJson: String?): BridgeSession.InvokeResult {
if (
command.startsWith(ClawdisCanvasCommand.NamespacePrefix) ||
@ -730,14 +788,20 @@ class NodeRuntime(context: Context) {
}
}
ClawdisScreenCommand.Record.rawValue -> {
val res =
try {
screenRecorder.record(paramsJson)
} catch (err: Throwable) {
val (code, message) = invokeErrorFromThrowable(err)
return BridgeSession.InvokeResult.error(code = code, message = message)
}
BridgeSession.InvokeResult.ok(res.payloadJson)
// Status pill mirrors screen recording state so it stays visible without overlay stacking.
_screenRecordActive.value = true
try {
val res =
try {
screenRecorder.record(paramsJson)
} catch (err: Throwable) {
val (code, message) = invokeErrorFromThrowable(err)
return BridgeSession.InvokeResult.error(code = code, message = message)
}
BridgeSession.InvokeResult.ok(res.payloadJson)
} finally {
_screenRecordActive.value = false
}
}
else ->
BridgeSession.InvokeResult.error(
@ -780,7 +844,7 @@ class NodeRuntime(context: Context) {
val raw = session.currentCanvasHostUrl()?.trim().orEmpty()
if (raw.isBlank()) return null
val base = raw.trimEnd('/')
return "${base}/__clawdis__/a2ui/"
return "${base}/__clawdis__/a2ui/?platform=android"
}
private suspend fun ensureA2uiReady(a2uiUrl: String): Boolean {
@ -866,6 +930,8 @@ class NodeRuntime(context: Context) {
private data class Quad<A, B, C, D>(val first: A, val second: B, val third: C, val fourth: D)
private const val DEFAULT_SEAM_COLOR_ARGB: Long = 0xFF4F7A9A
private const val a2uiReadyCheckJS: String =
"""
(() => {
@ -920,3 +986,12 @@ private fun JsonElement?.asStringOrNull(): String? =
is JsonPrimitive -> content
else -> null
}
private fun parseHexColorArgb(raw: String?): Long? {
val trimmed = raw?.trim().orEmpty()
if (trimmed.isEmpty()) return null
val hex = if (trimmed.startsWith("#")) trimmed.drop(1) else trimmed
if (hex.length != 6) return null
val rgb = hex.toLongOrNull(16) ?: return null
return 0xFF000000L or rgb
}
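For reference, the parser above behaves as follows (its logic is reproduced verbatim so the example is self-contained): a 6-digit RGB hex string, with or without `#`, maps to an opaque ARGB long; anything else yields null.

```kotlin
fun parseHexColorArgb(raw: String?): Long? {
    val trimmed = raw?.trim().orEmpty()
    if (trimmed.isEmpty()) return null
    val hex = if (trimmed.startsWith("#")) trimmed.drop(1) else trimmed
    if (hex.length != 6) return null
    val rgb = hex.toLongOrNull(16) ?: return null
    return 0xFF000000L or rgb // force full alpha
}

fun main() {
    check(parseHexColorArgb("#4F7A9A") == 0xFF4F7A9AL) // the default seam color
    check(parseHexColorArgb("4f7a9a") == 0xFF4F7A9AL)  // '#' optional, case-insensitive
    check(parseHexColorArgb("#FFF") == null)           // shorthand hex rejected
    check(parseHexColorArgb(null) == null)
}
```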


@ -73,6 +73,9 @@ class SecurePrefs(context: Context) {
private val _voiceWakeMode = MutableStateFlow(loadVoiceWakeMode())
val voiceWakeMode: StateFlow<VoiceWakeMode> = _voiceWakeMode
private val _talkEnabled = MutableStateFlow(prefs.getBoolean("talk.enabled", false))
val talkEnabled: StateFlow<Boolean> = _talkEnabled
fun setLastDiscoveredStableId(value: String) {
val trimmed = value.trim()
prefs.edit { putString("bridge.lastDiscoveredStableId", trimmed) }
@ -158,6 +161,11 @@ class SecurePrefs(context: Context) {
_voiceWakeMode.value = mode
}
fun setTalkEnabled(value: Boolean) {
prefs.edit { putBoolean("talk.enabled", value) }
_talkEnabled.value = value
}
private fun loadVoiceWakeMode(): VoiceWakeMode {
val raw = prefs.getString(voiceWakeModeKey, null)
val resolved = VoiceWakeMode.fromRawValue(raw)


@ -130,20 +130,36 @@ class BridgeDiscovery(
object : NsdManager.ResolveListener {
override fun onResolveFailed(serviceInfo: NsdServiceInfo, errorCode: Int) {}
override fun onServiceResolved(resolved: NsdServiceInfo) {
val host = resolved.host?.hostAddress ?: return
val port = resolved.port
if (port <= 0) return
override fun onServiceResolved(resolved: NsdServiceInfo) {
val host = resolved.host?.hostAddress ?: return
val port = resolved.port
if (port <= 0) return
val rawServiceName = resolved.serviceName
val serviceName = BonjourEscapes.decode(rawServiceName)
val displayName = BonjourEscapes.decode(txt(resolved, "displayName") ?: serviceName)
val id = stableId(serviceName, "local.")
localById[id] = BridgeEndpoint(stableId = id, name = displayName, host = host, port = port)
publish()
}
},
)
val rawServiceName = resolved.serviceName
val serviceName = BonjourEscapes.decode(rawServiceName)
val displayName = BonjourEscapes.decode(txt(resolved, "displayName") ?: serviceName)
val lanHost = txt(resolved, "lanHost")
val tailnetDns = txt(resolved, "tailnetDns")
val gatewayPort = txtInt(resolved, "gatewayPort")
val bridgePort = txtInt(resolved, "bridgePort")
val canvasPort = txtInt(resolved, "canvasPort")
val id = stableId(serviceName, "local.")
localById[id] =
BridgeEndpoint(
stableId = id,
name = displayName,
host = host,
port = port,
lanHost = lanHost,
tailnetDns = tailnetDns,
gatewayPort = gatewayPort,
bridgePort = bridgePort,
canvasPort = canvasPort,
)
publish()
}
},
)
}
private fun publish() {
@ -189,6 +205,10 @@ class BridgeDiscovery(
}
}
private fun txtInt(info: NsdServiceInfo, key: String): Int? {
return txt(info, key)?.toIntOrNull()
}
private suspend fun refreshUnicast(domain: String) {
val ptrName = "${serviceType}${domain}"
val ptrMsg = lookupUnicastMessage(ptrName, Type.PTR) ?: return
@ -227,8 +247,24 @@ class BridgeDiscovery(
}
val instanceName = BonjourEscapes.decode(decodeInstanceName(instanceFqdn, domain))
val displayName = BonjourEscapes.decode(txtValue(txt, "displayName") ?: instanceName)
val lanHost = txtValue(txt, "lanHost")
val tailnetDns = txtValue(txt, "tailnetDns")
val gatewayPort = txtIntValue(txt, "gatewayPort")
val bridgePort = txtIntValue(txt, "bridgePort")
val canvasPort = txtIntValue(txt, "canvasPort")
val id = stableId(instanceName, domain)
next[id] = BridgeEndpoint(stableId = id, name = displayName, host = host, port = port)
next[id] =
BridgeEndpoint(
stableId = id,
name = displayName,
host = host,
port = port,
lanHost = lanHost,
tailnetDns = tailnetDns,
gatewayPort = gatewayPort,
bridgePort = bridgePort,
canvasPort = canvasPort,
)
}
unicastById.clear()
@ -434,6 +470,10 @@ class BridgeDiscovery(
return null
}
private fun txtIntValue(records: List<TXTRecord>, key: String): Int? {
return txtValue(records, key)?.toIntOrNull()
}
private fun decodeDnsTxtString(raw: String): String {
// dnsjava treats TXT as opaque bytes and decodes as ISO-8859-1 to preserve bytes.
// Our TXT payload is UTF-8 (written by the gateway), so re-decode when possible.


@ -5,6 +5,11 @@ data class BridgeEndpoint(
val name: String,
val host: String,
val port: Int,
val lanHost: String? = null,
val tailnetDns: String? = null,
val gatewayPort: Int? = null,
val bridgePort: Int? = null,
val canvasPort: Int? = null,
) {
companion object {
fun manual(host: String, port: Int): BridgeEndpoint =
@ -16,4 +21,3 @@ data class BridgeEndpoint(
)
}
}


@ -11,6 +11,7 @@ import kotlinx.coroutines.launch
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock
import kotlinx.coroutines.withContext
import com.steipete.clawdis.node.BuildConfig
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonArray
import kotlinx.serialization.json.JsonObject
@ -23,6 +24,7 @@ import java.io.BufferedWriter
import java.io.InputStreamReader
import java.io.OutputStreamWriter
import java.net.InetSocketAddress
import java.net.URI
import java.net.Socket
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap
@ -75,6 +77,8 @@ class BridgeSession(
fun disconnect() {
desired = null
// Unblock connectOnce() read loop. Coroutine cancellation alone won't interrupt BufferedReader.readLine().
currentConnection?.closeQuietly()
scope.launch(Dispatchers.IO) {
job?.cancelAndJoin()
job = null
@ -213,7 +217,17 @@ class BridgeSession(
when (first["type"].asStringOrNull()) {
"hello-ok" -> {
val name = first["serverName"].asStringOrNull() ?: "Bridge"
canvasHostUrl = first["canvasHostUrl"].asStringOrNull()?.trim()?.ifEmpty { null }
val rawCanvasUrl = first["canvasHostUrl"].asStringOrNull()?.trim()?.ifEmpty { null }
canvasHostUrl = normalizeCanvasHostUrl(rawCanvasUrl, endpoint)
if (BuildConfig.DEBUG) {
// Local JVM unit tests use android.jar stubs; Log.d can throw "not mocked".
runCatching {
android.util.Log.d(
"ClawdisBridge",
"canvasHostUrl resolved=${canvasHostUrl ?: "none"} (raw=${rawCanvasUrl ?: "none"})",
)
}
}
onConnected(name, conn.remoteAddress)
}
"error" -> {
@ -292,6 +306,37 @@ class BridgeSession(
conn.closeQuietly()
}
}
private fun normalizeCanvasHostUrl(raw: String?, endpoint: BridgeEndpoint): String? {
val trimmed = raw?.trim().orEmpty()
val parsed = trimmed.takeIf { it.isNotBlank() }?.let { runCatching { URI(it) }.getOrNull() }
val host = parsed?.host?.trim().orEmpty()
val port = parsed?.port ?: -1
val scheme = parsed?.scheme?.trim().orEmpty().ifBlank { "http" }
if (trimmed.isNotBlank() && !isLoopbackHost(host)) {
return trimmed
}
val fallbackHost =
endpoint.tailnetDns?.trim().takeIf { !it.isNullOrEmpty() }
?: endpoint.lanHost?.trim().takeIf { !it.isNullOrEmpty() }
?: endpoint.host.trim()
if (fallbackHost.isEmpty()) return trimmed.ifBlank { null }
val fallbackPort = endpoint.canvasPort ?: if (port > 0) port else 18793
val formattedHost = if (fallbackHost.contains(":")) "[${fallbackHost}]" else fallbackHost
return "$scheme://$formattedHost:$fallbackPort"
}
private fun isLoopbackHost(raw: String?): Boolean {
val host = raw?.trim()?.lowercase().orEmpty()
if (host.isEmpty()) return false
if (host == "localhost") return true
if (host == "::1") return true
if (host == "0.0.0.0" || host == "::") return true
return host.startsWith("127.")
}
}
private fun JsonElement?.asObjectOrNull(): JsonObject? = this as? JsonObject
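The loopback rewrite in `normalizeCanvasHostUrl` above prefers the tailnet DNS name, then the LAN host advertised via TXT records, then the resolved endpoint host. A hypothetical condensation of that preference order (function name and sample hosts are illustrative, not the shipped API):

```kotlin
// Illustrative only: mirrors the host-fallback order in normalizeCanvasHostUrl.
fun pickCanvasHost(tailnetDns: String?, lanHost: String?, endpointHost: String): String =
    sequenceOf(tailnetDns, lanHost, endpointHost)
        .mapNotNull { it?.trim() }         // drop nulls, strip whitespace
        .firstOrNull { it.isNotEmpty() }   // first non-blank candidate wins
        ?: endpointHost

fun main() {
    check(pickCanvasHost("mac.ts.net", "192.168.1.5", "10.0.0.2") == "mac.ts.net")
    check(pickCanvasHost(null, "192.168.1.5", "10.0.0.2") == "192.168.1.5")
    check(pickCanvasHost(null, "  ", "10.0.0.2") == "10.0.0.2")
}
```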


@ -28,6 +28,7 @@ import kotlinx.coroutines.withContext
import java.io.ByteArrayOutputStream
import java.io.File
import java.util.concurrent.Executor
import kotlin.math.roundToInt
import kotlin.coroutines.resume
import kotlin.coroutines.resumeWithException
@ -99,14 +100,36 @@ class CameraCaptureManager(private val context: Context) {
decoded
}
val out = ByteArrayOutputStream()
val jpegQuality = (quality * 100.0).toInt().coerceIn(10, 100)
if (!scaled.compress(Bitmap.CompressFormat.JPEG, jpegQuality, out)) {
throw IllegalStateException("UNAVAILABLE: failed to encode JPEG")
}
val base64 = Base64.encodeToString(out.toByteArray(), Base64.NO_WRAP)
val maxPayloadBytes = 5 * 1024 * 1024
// Base64 inflates payloads by ~4/3; cap encoded bytes so the payload stays under 5MB (API limit).
val maxEncodedBytes = (maxPayloadBytes / 4) * 3
val result =
JpegSizeLimiter.compressToLimit(
initialWidth = scaled.width,
initialHeight = scaled.height,
startQuality = (quality * 100.0).roundToInt().coerceIn(10, 100),
maxBytes = maxEncodedBytes,
encode = { width, height, q ->
val bitmap =
if (width == scaled.width && height == scaled.height) {
scaled
} else {
scaled.scale(width, height)
}
val out = ByteArrayOutputStream()
if (!bitmap.compress(Bitmap.CompressFormat.JPEG, q, out)) {
if (bitmap !== scaled) bitmap.recycle()
throw IllegalStateException("UNAVAILABLE: failed to encode JPEG")
}
if (bitmap !== scaled) {
bitmap.recycle()
}
out.toByteArray()
},
)
val base64 = Base64.encodeToString(result.bytes, Base64.NO_WRAP)
Payload(
"""{"format":"jpg","base64":"$base64","width":${scaled.width},"height":${scaled.height}}""",
"""{"format":"jpg","base64":"$base64","width":${result.width},"height":${result.height}}""",
)
}
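The `(maxPayloadBytes / 4) * 3` cap in the camera path above reflects base64's 4-for-3 expansion: every 3 raw bytes become 4 output characters, so the raw JPEG must fit in 3/4 of the 5 MB payload limit. The arithmetic checks out as follows:

```kotlin
fun main() {
    val maxPayloadBytes = 5 * 1024 * 1024           // API payload limit
    val maxEncodedBytes = (maxPayloadBytes / 4) * 3 // raw bytes whose base64 fits the limit
    check(maxEncodedBytes == 3_932_160)
    // Encoded length of n raw bytes (NO_WRAP, with padding) is 4 * ceil(n / 3).
    val encodedLen = 4 * ((maxEncodedBytes + 2) / 3)
    check(encodedLen <= maxPayloadBytes)
}
```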


@ -3,6 +3,7 @@ package com.steipete.clawdis.node.node
import android.graphics.Bitmap
import android.graphics.Canvas
import android.os.Looper
import android.util.Log
import android.webkit.WebView
import androidx.core.graphics.createBitmap
import androidx.core.graphics.scale
@ -16,6 +17,7 @@ import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
import com.steipete.clawdis.node.BuildConfig
import kotlin.coroutines.resume
class CanvasController {
@ -81,8 +83,14 @@ class CanvasController {
val currentUrl = url
withWebViewOnMain { wv ->
if (currentUrl == null) {
if (BuildConfig.DEBUG) {
Log.d("ClawdisCanvas", "load scaffold: $scaffoldAssetUrl")
}
wv.loadUrl(scaffoldAssetUrl)
} else {
if (BuildConfig.DEBUG) {
Log.d("ClawdisCanvas", "load url: $currentUrl")
}
wv.loadUrl(currentUrl)
}
}


@ -0,0 +1,61 @@
package com.steipete.clawdis.node.node
import kotlin.math.max
import kotlin.math.min
import kotlin.math.roundToInt
internal data class JpegSizeLimiterResult(
val bytes: ByteArray,
val width: Int,
val height: Int,
val quality: Int,
)
internal object JpegSizeLimiter {
fun compressToLimit(
initialWidth: Int,
initialHeight: Int,
startQuality: Int,
maxBytes: Int,
minQuality: Int = 20,
minSize: Int = 256,
scaleStep: Double = 0.85,
maxScaleAttempts: Int = 6,
maxQualityAttempts: Int = 6,
encode: (width: Int, height: Int, quality: Int) -> ByteArray,
): JpegSizeLimiterResult {
require(initialWidth > 0 && initialHeight > 0) { "Invalid image size" }
require(maxBytes > 0) { "Invalid maxBytes" }
var width = initialWidth
var height = initialHeight
val clampedStartQuality = startQuality.coerceIn(minQuality, 100)
var best = JpegSizeLimiterResult(bytes = encode(width, height, clampedStartQuality), width = width, height = height, quality = clampedStartQuality)
if (best.bytes.size <= maxBytes) return best
repeat(maxScaleAttempts) {
var quality = clampedStartQuality
repeat(maxQualityAttempts) {
val bytes = encode(width, height, quality)
best = JpegSizeLimiterResult(bytes = bytes, width = width, height = height, quality = quality)
if (bytes.size <= maxBytes) return best
if (quality <= minQuality) return@repeat
quality = max(minQuality, (quality * 0.75).roundToInt())
}
val minScale = (minSize.toDouble() / min(width, height).toDouble()).coerceAtMost(1.0)
val nextScale = max(scaleStep, minScale)
val nextWidth = max(minSize, (width * nextScale).roundToInt())
val nextHeight = max(minSize, (height * nextScale).roundToInt())
if (nextWidth == width && nextHeight == height) return@repeat
width = min(nextWidth, width)
height = min(nextHeight, height)
}
if (best.bytes.size > maxBytes) {
throw IllegalStateException("CAMERA_TOO_LARGE: ${best.bytes.size} bytes > $maxBytes bytes")
}
return best
}
}
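A condensed, hypothetical sketch of the strategy `compressToLimit` uses above — cut quality by roughly a quarter per attempt, then shrink dimensions by 15% and retry — with a toy size model standing in for the real JPEG encoder:

```kotlin
// Toy sketch of the quality-then-scale loop; not the shipped implementation.
fun shrinkToLimit(
    maxBytes: Int,
    sizeAt: (scale: Double, quality: Int) -> Int, // stand-in for an encode pass
): Pair<Double, Int>? {
    var scale = 1.0
    repeat(6) { // scale attempts
        var quality = 90
        repeat(6) { // quality attempts at this scale
            if (sizeAt(scale, quality) <= maxBytes) return scale to quality
            quality = maxOf(20, (quality * 3) / 4)
        }
        scale *= 0.85 // then shrink dimensions and try again
    }
    return null // caller surfaces CAMERA_TOO_LARGE
}

fun main() {
    // Model: bytes grow with area and quality; ~10 MB at full size, quality 100.
    val sizeAt = { s: Double, q: Int -> (10_000_000 * s * s * q / 100).toInt() }
    val fit = shrinkToLimit(5 * 1024 * 1024, sizeAt)
    check(fit != null && sizeAt(fit.first, fit.second) <= 5 * 1024 * 1024)
}
```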


@ -1,64 +1,26 @@
package com.steipete.clawdis.node.ui
import androidx.compose.animation.AnimatedVisibility
import androidx.compose.animation.fadeIn
import androidx.compose.animation.fadeOut
import androidx.compose.animation.slideInVertically
import androidx.compose.animation.slideOutVertically
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.statusBarsPadding
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.CheckCircle
import androidx.compose.material.icons.filled.Error
import androidx.compose.material.icons.filled.FiberManualRecord
import androidx.compose.material.icons.filled.PhotoCamera
import androidx.compose.material3.CircularProgressIndicator
import androidx.compose.material3.Icon
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.LaunchedEffect
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableFloatStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.alpha
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.style.TextOverflow
import androidx.compose.ui.unit.dp
import com.steipete.clawdis.node.CameraHudKind
import com.steipete.clawdis.node.CameraHudState
import kotlinx.coroutines.delay
@Composable
fun CameraHudOverlay(
hud: CameraHudState?,
flashToken: Long,
fun CameraFlashOverlay(
token: Long,
modifier: Modifier = Modifier,
) {
Box(modifier = modifier.fillMaxSize()) {
CameraFlash(token = flashToken)
AnimatedVisibility(
visible = hud != null,
enter = slideInVertically(initialOffsetY = { -it / 2 }) + fadeIn(),
exit = slideOutVertically(targetOffsetY = { -it / 2 }) + fadeOut(),
modifier = Modifier.align(Alignment.TopStart).statusBarsPadding().padding(start = 12.dp, top = 58.dp),
) {
if (hud != null) {
Toast(hud = hud)
}
}
CameraFlash(token = token)
}
}
@ -80,44 +42,3 @@ private fun CameraFlash(token: Long) {
.background(Color.White),
)
}
@Composable
private fun Toast(hud: CameraHudState) {
Surface(
shape = RoundedCornerShape(14.dp),
color = MaterialTheme.colorScheme.surface.copy(alpha = 0.85f),
tonalElevation = 2.dp,
shadowElevation = 8.dp,
) {
Row(
modifier = Modifier.padding(vertical = 10.dp, horizontal = 12.dp),
verticalAlignment = Alignment.CenterVertically,
) {
when (hud.kind) {
CameraHudKind.Photo -> {
Icon(Icons.Default.PhotoCamera, contentDescription = null)
Spacer(Modifier.size(10.dp))
CircularProgressIndicator(modifier = Modifier.size(14.dp), strokeWidth = 2.dp)
}
CameraHudKind.Recording -> {
Icon(Icons.Default.FiberManualRecord, contentDescription = null, tint = Color.Red)
}
CameraHudKind.Success -> {
Icon(Icons.Default.CheckCircle, contentDescription = null)
}
CameraHudKind.Error -> {
Icon(Icons.Default.Error, contentDescription = null)
}
}
Spacer(Modifier.size(10.dp))
Text(
text = hud.message,
style = MaterialTheme.typography.bodyMedium,
maxLines = 1,
overflow = TextOverflow.Ellipsis,
)
}
}
}

@ -7,12 +7,18 @@ import android.graphics.Color
import android.util.Log
import android.view.View
import android.webkit.JavascriptInterface
import android.webkit.ConsoleMessage
import android.webkit.WebChromeClient
import android.webkit.WebView
import android.webkit.WebSettings
import android.webkit.WebResourceError
import android.webkit.WebResourceRequest
import android.webkit.WebResourceResponse
import android.webkit.WebViewClient
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.webkit.WebSettingsCompat
import androidx.webkit.WebViewFeature
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
@ -28,10 +34,20 @@ import androidx.compose.material3.ExperimentalMaterial3Api
import androidx.compose.material3.FilledTonalIconButton
import androidx.compose.material3.Icon
import androidx.compose.material3.IconButtonDefaults
import androidx.compose.material3.LocalContentColor
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.ModalBottomSheet
import androidx.compose.material3.rememberModalBottomSheetState
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.ChatBubble
import androidx.compose.material.icons.filled.CheckCircle
import androidx.compose.material.icons.filled.Error
import androidx.compose.material.icons.filled.FiberManualRecord
import androidx.compose.material.icons.filled.PhotoCamera
import androidx.compose.material.icons.filled.RecordVoiceOver
import androidx.compose.material.icons.filled.Refresh
import androidx.compose.material.icons.filled.Report
import androidx.compose.material.icons.filled.ScreenShare
import androidx.compose.material.icons.filled.Settings
import androidx.compose.runtime.Composable
import androidx.compose.runtime.collectAsState
@ -41,12 +57,15 @@ import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Color as ComposeColor
import androidx.compose.ui.graphics.lerp
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.unit.dp
import androidx.compose.ui.viewinterop.AndroidView
import androidx.compose.ui.window.Popup
import androidx.compose.ui.window.PopupProperties
import androidx.core.content.ContextCompat
import com.steipete.clawdis.node.CameraHudKind
import com.steipete.clawdis.node.MainViewModel
@OptIn(ExperimentalMaterial3Api::class)
@ -60,6 +79,105 @@ fun RootScreen(viewModel: MainViewModel) {
val statusText by viewModel.statusText.collectAsState()
val cameraHud by viewModel.cameraHud.collectAsState()
val cameraFlashToken by viewModel.cameraFlashToken.collectAsState()
val screenRecordActive by viewModel.screenRecordActive.collectAsState()
val isForeground by viewModel.isForeground.collectAsState()
val voiceWakeStatusText by viewModel.voiceWakeStatusText.collectAsState()
val talkEnabled by viewModel.talkEnabled.collectAsState()
val talkStatusText by viewModel.talkStatusText.collectAsState()
val talkIsListening by viewModel.talkIsListening.collectAsState()
val talkIsSpeaking by viewModel.talkIsSpeaking.collectAsState()
val seamColorArgb by viewModel.seamColorArgb.collectAsState()
val seamColor = remember(seamColorArgb) { ComposeColor(seamColorArgb) }
val audioPermissionLauncher =
rememberLauncherForActivityResult(ActivityResultContracts.RequestPermission()) { granted ->
if (granted) viewModel.setTalkEnabled(true)
}
val activity =
remember(cameraHud, screenRecordActive, isForeground, statusText, voiceWakeStatusText) {
// Status pill owns transient activity state so it doesn't overlap the connection indicator.
if (!isForeground) {
return@remember StatusActivity(
title = "Foreground required",
icon = Icons.Default.Report,
contentDescription = "Foreground required",
)
}
val lowerStatus = statusText.lowercase()
if (lowerStatus.contains("repair")) {
return@remember StatusActivity(
title = "Repairing…",
icon = Icons.Default.Refresh,
contentDescription = "Repairing",
)
}
if (lowerStatus.contains("pairing") || lowerStatus.contains("approval")) {
return@remember StatusActivity(
title = "Approval pending",
icon = Icons.Default.RecordVoiceOver,
contentDescription = "Approval pending",
)
}
// Avoid duplicating the primary bridge status ("Connecting…") in the activity slot.
if (screenRecordActive) {
return@remember StatusActivity(
title = "Recording screen…",
icon = Icons.Default.ScreenShare,
contentDescription = "Recording screen",
tint = ComposeColor.Red,
)
}
cameraHud?.let { hud ->
return@remember when (hud.kind) {
CameraHudKind.Photo ->
StatusActivity(
title = hud.message,
icon = Icons.Default.PhotoCamera,
contentDescription = "Taking photo",
)
CameraHudKind.Recording ->
StatusActivity(
title = hud.message,
icon = Icons.Default.FiberManualRecord,
contentDescription = "Recording",
tint = ComposeColor.Red,
)
CameraHudKind.Success ->
StatusActivity(
title = hud.message,
icon = Icons.Default.CheckCircle,
contentDescription = "Capture finished",
)
CameraHudKind.Error ->
StatusActivity(
title = hud.message,
icon = Icons.Default.Error,
contentDescription = "Capture failed",
tint = ComposeColor.Red,
)
}
}
if (voiceWakeStatusText.contains("Microphone permission", ignoreCase = true)) {
return@remember StatusActivity(
title = "Mic permission",
icon = Icons.Default.Error,
contentDescription = "Mic permission required",
)
}
if (voiceWakeStatusText == "Paused") {
val suffix = if (!isForeground) " (background)" else ""
return@remember StatusActivity(
title = "Voice Wake paused$suffix",
icon = Icons.Default.RecordVoiceOver,
contentDescription = "Voice Wake paused",
)
}
null
}
val bridgeState =
remember(serverName, statusText) {
@ -80,9 +198,9 @@ fun RootScreen(viewModel: MainViewModel) {
CanvasView(viewModel = viewModel, modifier = Modifier.fillMaxSize())
}
// Camera HUD (flash + toast) must be in a Popup to render above the WebView.
// Camera flash must be in a Popup to render above the WebView.
Popup(alignment = Alignment.Center, properties = PopupProperties(focusable = false)) {
CameraHudOverlay(hud = cameraHud, flashToken = cameraFlashToken, modifier = Modifier.fillMaxSize())
CameraFlashOverlay(token = cameraFlashToken, modifier = Modifier.fillMaxSize())
}
// Keep the overlay buttons above the WebView canvas (AndroidView), otherwise they may not receive touches.
@ -90,6 +208,7 @@ fun RootScreen(viewModel: MainViewModel) {
StatusPill(
bridge = bridgeState,
voiceEnabled = voiceEnabled,
activity = activity,
onClick = { sheet = Sheet.Settings },
modifier = Modifier.windowInsetsPadding(safeOverlayInsets).padding(start = 12.dp, top = 12.dp),
)
@ -106,6 +225,38 @@ fun RootScreen(viewModel: MainViewModel) {
icon = { Icon(Icons.Default.ChatBubble, contentDescription = "Chat") },
)
// Talk mode gets a dedicated side bubble instead of burying it in settings.
val baseOverlay = overlayContainerColor()
val talkContainer =
lerp(
baseOverlay,
seamColor.copy(alpha = baseOverlay.alpha),
if (talkEnabled) 0.35f else 0.22f,
)
val talkContent = if (talkEnabled) seamColor else overlayIconColor()
OverlayIconButton(
onClick = {
val next = !talkEnabled
if (next) {
val micOk =
ContextCompat.checkSelfPermission(context, Manifest.permission.RECORD_AUDIO) ==
PackageManager.PERMISSION_GRANTED
if (!micOk) audioPermissionLauncher.launch(Manifest.permission.RECORD_AUDIO)
viewModel.setTalkEnabled(true)
} else {
viewModel.setTalkEnabled(false)
}
},
containerColor = talkContainer,
contentColor = talkContent,
icon = {
Icon(
Icons.Default.RecordVoiceOver,
contentDescription = "Talk Mode",
)
},
)
OverlayIconButton(
onClick = { sheet = Sheet.Settings },
icon = { Icon(Icons.Default.Settings, contentDescription = "Settings") },
@ -113,6 +264,17 @@ fun RootScreen(viewModel: MainViewModel) {
}
}
if (talkEnabled) {
Popup(alignment = Alignment.Center, properties = PopupProperties(focusable = false)) {
TalkOrbOverlay(
seamColor = seamColor,
statusText = talkStatusText,
isListening = talkIsListening,
isSpeaking = talkIsSpeaking,
)
}
}
val currentSheet = sheet
if (currentSheet != null) {
ModalBottomSheet(
@ -136,14 +298,16 @@ private enum class Sheet {
private fun OverlayIconButton(
onClick: () -> Unit,
icon: @Composable () -> Unit,
containerColor: ComposeColor? = null,
contentColor: ComposeColor? = null,
) {
FilledTonalIconButton(
onClick = onClick,
modifier = Modifier.size(44.dp),
colors =
IconButtonDefaults.filledTonalIconButtonColors(
containerColor = overlayContainerColor(),
contentColor = overlayIconColor(),
containerColor = containerColor ?: overlayContainerColor(),
contentColor = contentColor ?: overlayIconColor(),
),
) {
icon()
@ -163,6 +327,19 @@ private fun CanvasView(viewModel: MainViewModel, modifier: Modifier = Modifier)
// Some embedded web UIs (incl. the "background website") use localStorage/sessionStorage.
settings.domStorageEnabled = true
settings.mixedContentMode = WebSettings.MIXED_CONTENT_COMPATIBILITY_MODE
if (WebViewFeature.isFeatureSupported(WebViewFeature.FORCE_DARK)) {
WebSettingsCompat.setForceDark(settings, WebSettingsCompat.FORCE_DARK_OFF)
}
if (WebViewFeature.isFeatureSupported(WebViewFeature.ALGORITHMIC_DARKENING)) {
WebSettingsCompat.setAlgorithmicDarkeningAllowed(settings, false)
}
if (isDebuggable) {
Log.d("ClawdisWebView", "userAgent: ${settings.userAgentString}")
}
isScrollContainer = true
overScrollMode = View.OVER_SCROLL_IF_CONTENT_SCROLLS
isVerticalScrollBarEnabled = true
isHorizontalScrollBarEnabled = true
webViewClient =
object : WebViewClient() {
override fun onReceivedError(
@ -189,11 +366,38 @@ private fun CanvasView(viewModel: MainViewModel, modifier: Modifier = Modifier)
}
override fun onPageFinished(view: WebView, url: String?) {
if (isDebuggable) {
Log.d("ClawdisWebView", "onPageFinished: $url")
}
viewModel.canvas.onPageFinished()
}
override fun onRenderProcessGone(
view: WebView,
detail: android.webkit.RenderProcessGoneDetail,
): Boolean {
if (isDebuggable) {
Log.e(
"ClawdisWebView",
"onRenderProcessGone didCrash=${detail.didCrash()} priorityAtExit=${detail.rendererPriorityAtExit()}",
)
}
return true
}
}
setBackgroundColor(Color.BLACK)
setLayerType(View.LAYER_TYPE_HARDWARE, null)
webChromeClient =
object : WebChromeClient() {
override fun onConsoleMessage(consoleMessage: ConsoleMessage?): Boolean {
if (!isDebuggable) return false
val msg = consoleMessage ?: return false
Log.d(
"ClawdisWebView",
"console ${msg.messageLevel()} @ ${msg.sourceId()}:${msg.lineNumber()} ${msg.message()}",
)
return false
}
}
// Use default layer/background; avoid forcing a black fill over WebView content.
val a2uiBridge =
CanvasA2UIActionBridge { payload ->

@ -2,6 +2,7 @@ package com.steipete.clawdis.node.ui
import android.Manifest
import android.content.pm.PackageManager
import android.os.Build
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.animation.AnimatedVisibility
@ -46,6 +47,7 @@ import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.core.content.ContextCompat
import com.steipete.clawdis.node.BuildConfig
import com.steipete.clawdis.node.MainViewModel
import com.steipete.clawdis.node.NodeForegroundService
import com.steipete.clawdis.node.VoiceWakeMode
@ -74,6 +76,22 @@ fun SettingsSheet(viewModel: MainViewModel) {
val listState = rememberLazyListState()
val (wakeWordsText, setWakeWordsText) = remember { mutableStateOf("") }
val (advancedExpanded, setAdvancedExpanded) = remember { mutableStateOf(false) }
val deviceModel =
remember {
listOfNotNull(Build.MANUFACTURER, Build.MODEL)
.joinToString(" ")
.trim()
.ifEmpty { "Android" }
}
val appVersion =
remember {
val versionName = BuildConfig.VERSION_NAME.trim().ifEmpty { "dev" }
if (BuildConfig.DEBUG && !versionName.contains("dev", ignoreCase = true)) {
"$versionName-dev"
} else {
versionName
}
}
LaunchedEffect(wakeWords) { setWakeWordsText(wakeWords.joinToString(", ")) }
@ -142,6 +160,8 @@ fun SettingsSheet(viewModel: MainViewModel) {
)
}
item { Text("Instance ID: $instanceId", color = MaterialTheme.colorScheme.onSurfaceVariant) }
item { Text("Device: $deviceModel", color = MaterialTheme.colorScheme.onSurfaceVariant) }
item { Text("Version: $appVersion", color = MaterialTheme.colorScheme.onSurfaceVariant) }
item { HorizontalDivider() }
@ -181,9 +201,27 @@ fun SettingsSheet(viewModel: MainViewModel) {
item { Text("No bridges found yet.", color = MaterialTheme.colorScheme.onSurfaceVariant) }
} else {
items(items = visibleBridges, key = { it.stableId }) { bridge ->
val detailLines =
buildList {
add("IP: ${bridge.host}:${bridge.port}")
bridge.lanHost?.let { add("LAN: $it") }
bridge.tailnetDns?.let { add("Tailnet: $it") }
if (bridge.gatewayPort != null || bridge.bridgePort != null || bridge.canvasPort != null) {
val gw = bridge.gatewayPort?.toString() ?: ""
val br = (bridge.bridgePort ?: bridge.port).toString()
val canvas = bridge.canvasPort?.toString() ?: ""
add("Ports: gw $gw · bridge $br · canvas $canvas")
}
}
ListItem(
headlineContent = { Text(bridge.name) },
supportingContent = { Text("${bridge.host}:${bridge.port}") },
supportingContent = {
Column(verticalArrangement = Arrangement.spacedBy(2.dp)) {
detailLines.forEach { line ->
Text(line, color = MaterialTheme.colorScheme.onSurfaceVariant)
}
}
},
trailingContent = {
Button(
onClick = {

@ -28,6 +28,7 @@ import androidx.compose.ui.unit.dp
fun StatusPill(
bridge: BridgeState,
voiceEnabled: Boolean,
activity: StatusActivity? = null,
onClick: () -> Unit,
modifier: Modifier = Modifier,
) {
@ -62,23 +63,49 @@ fun StatusPill(
color = MaterialTheme.colorScheme.onSurfaceVariant,
)
Icon(
imageVector = if (voiceEnabled) Icons.Default.Mic else Icons.Default.MicOff,
contentDescription = if (voiceEnabled) "Voice enabled" else "Voice disabled",
tint =
if (voiceEnabled) {
overlayIconColor()
} else {
MaterialTheme.colorScheme.onSurfaceVariant
},
modifier = Modifier.size(18.dp),
)
if (activity != null) {
Row(
horizontalArrangement = Arrangement.spacedBy(6.dp),
verticalAlignment = Alignment.CenterVertically,
) {
Icon(
imageVector = activity.icon,
contentDescription = activity.contentDescription,
tint = activity.tint ?: overlayIconColor(),
modifier = Modifier.size(18.dp),
)
Text(
text = activity.title,
style = MaterialTheme.typography.labelLarge,
maxLines = 1,
)
}
} else {
Icon(
imageVector = if (voiceEnabled) Icons.Default.Mic else Icons.Default.MicOff,
contentDescription = if (voiceEnabled) "Voice enabled" else "Voice disabled",
tint =
if (voiceEnabled) {
overlayIconColor()
} else {
MaterialTheme.colorScheme.onSurfaceVariant
},
modifier = Modifier.size(18.dp),
)
}
Spacer(modifier = Modifier.width(2.dp))
}
}
}
data class StatusActivity(
val title: String,
val icon: androidx.compose.ui.graphics.vector.ImageVector,
val contentDescription: String,
val tint: Color? = null,
)
enum class BridgeState(val title: String, val color: Color) {
Connected("Connected", Color(0xFF2ECC71)),
Connecting("Connecting…", Color(0xFFF1C40F)),

@ -0,0 +1,134 @@
package com.steipete.clawdis.node.ui
import androidx.compose.animation.core.LinearEasing
import androidx.compose.animation.core.RepeatMode
import androidx.compose.animation.core.animateFloat
import androidx.compose.animation.core.infiniteRepeatable
import androidx.compose.animation.core.rememberInfiniteTransition
import androidx.compose.animation.core.tween
import androidx.compose.foundation.Canvas
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.drawscope.Stroke
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.unit.dp
@Composable
fun TalkOrbOverlay(
seamColor: Color,
statusText: String,
isListening: Boolean,
isSpeaking: Boolean,
modifier: Modifier = Modifier,
) {
val transition = rememberInfiniteTransition(label = "talk-orb")
val t by
transition.animateFloat(
initialValue = 0f,
targetValue = 1f,
animationSpec =
infiniteRepeatable(
animation = tween(durationMillis = 1500, easing = LinearEasing),
repeatMode = RepeatMode.Restart,
),
label = "pulse",
)
val trimmed = statusText.trim()
val showStatus = trimmed.isNotEmpty() && trimmed != "Off"
val phase =
when {
isSpeaking -> "Speaking"
isListening -> "Listening"
else -> "Thinking"
}
Column(
modifier = modifier.padding(24.dp),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.spacedBy(12.dp),
) {
Box(contentAlignment = Alignment.Center) {
Canvas(modifier = Modifier.size(360.dp)) {
val center = this.center
val baseRadius = size.minDimension * 0.30f
val ring1 = 1.05f + (t * 0.25f)
val ring2 = 1.20f + (t * 0.55f)
val ringAlpha1 = (1f - t) * 0.34f
val ringAlpha2 = (1f - t) * 0.22f
drawCircle(
color = seamColor.copy(alpha = ringAlpha1),
radius = baseRadius * ring1,
center = center,
style = Stroke(width = 3.dp.toPx()),
)
drawCircle(
color = seamColor.copy(alpha = ringAlpha2),
radius = baseRadius * ring2,
center = center,
style = Stroke(width = 3.dp.toPx()),
)
drawCircle(
brush =
Brush.radialGradient(
colors =
listOf(
seamColor.copy(alpha = 0.92f),
seamColor.copy(alpha = 0.40f),
Color.Black.copy(alpha = 0.56f),
),
center = center,
radius = baseRadius * 1.35f,
),
radius = baseRadius,
center = center,
)
drawCircle(
color = seamColor.copy(alpha = 0.34f),
radius = baseRadius,
center = center,
style = Stroke(width = 1.dp.toPx()),
)
}
}
if (showStatus) {
Surface(
color = Color.Black.copy(alpha = 0.40f),
shape = CircleShape,
) {
Text(
text = trimmed,
modifier = Modifier.padding(horizontal = 14.dp, vertical = 8.dp),
color = Color.White.copy(alpha = 0.92f),
style = MaterialTheme.typography.labelLarge,
fontWeight = FontWeight.SemiBold,
)
}
} else {
Text(
text = phase,
color = Color.White.copy(alpha = 0.80f),
style = MaterialTheme.typography.labelLarge,
fontWeight = FontWeight.SemiBold,
)
}
}
}

@ -17,6 +17,7 @@ import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.graphics.asImageBitmap
import androidx.compose.ui.layout.ContentScale
import androidx.compose.ui.text.AnnotatedString
@ -31,7 +32,7 @@ import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.withContext
@Composable
fun ChatMarkdown(text: String) {
fun ChatMarkdown(text: String, textColor: Color) {
val blocks = remember(text) { splitMarkdown(text) }
val inlineCodeBg = MaterialTheme.colorScheme.surfaceContainerLow
@ -44,7 +45,7 @@ fun ChatMarkdown(text: String) {
Text(
text = parseInlineMarkdown(trimmed, inlineCodeBg = inlineCodeBg),
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurface,
color = textColor,
)
}
is ChatMarkdownBlock.Code -> {

@ -7,11 +7,9 @@ import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.layout.width
import androidx.compose.foundation.shape.CircleShape
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.material3.MaterialTheme
@ -60,20 +58,21 @@ fun ChatMessageBubble(message: ChatMessage) {
.background(bubbleBackground(isUser))
.padding(horizontal = 12.dp, vertical = 10.dp),
) {
ChatMessageBody(content = message.content)
val textColor = textColorOverBubble(isUser)
ChatMessageBody(content = message.content, textColor = textColor)
}
}
}
}
@Composable
private fun ChatMessageBody(content: List<ChatMessageContent>) {
private fun ChatMessageBody(content: List<ChatMessageContent>, textColor: Color) {
Column(verticalArrangement = Arrangement.spacedBy(10.dp)) {
for (part in content) {
when (part.type) {
"text" -> {
val text = part.text ?: continue
ChatMarkdown(text = text)
ChatMarkdown(text = text, textColor = textColor)
}
else -> {
val b64 = part.base64 ?: continue
@ -131,7 +130,7 @@ fun ChatStreamingAssistantBubble(text: String) {
color = MaterialTheme.colorScheme.surfaceContainer,
) {
Box(modifier = Modifier.padding(horizontal = 12.dp, vertical = 10.dp)) {
ChatMarkdown(text = text)
ChatMarkdown(text = text, textColor = MaterialTheme.colorScheme.onSurface)
}
}
}
@ -150,6 +149,15 @@ private fun bubbleBackground(isUser: Boolean): Brush {
}
}
@Composable
private fun textColorOverBubble(isUser: Boolean): Color {
return if (isUser) {
MaterialTheme.colorScheme.onPrimary
} else {
MaterialTheme.colorScheme.onSurface
}
}
@Composable
private fun ChatBase64Image(base64: String, mimeType: String?) {
var image by remember(base64) { mutableStateOf<androidx.compose.ui.graphics.ImageBitmap?>(null) }

@ -0,0 +1,98 @@
package com.steipete.clawdis.node.voice
import android.media.MediaDataSource
import kotlin.math.min
internal class StreamingMediaDataSource : MediaDataSource() {
private data class Chunk(val start: Long, val data: ByteArray)
private val lock = Object()
private val chunks = ArrayList<Chunk>()
private var totalSize: Long = 0
private var closed = false
private var finished = false
private var lastReadIndex = 0
fun append(data: ByteArray) {
if (data.isEmpty()) return
synchronized(lock) {
if (closed || finished) return
val chunk = Chunk(totalSize, data)
chunks.add(chunk)
totalSize += data.size.toLong()
lock.notifyAll()
}
}
fun finish() {
synchronized(lock) {
if (closed) return
finished = true
lock.notifyAll()
}
}
fun fail() {
synchronized(lock) {
closed = true
lock.notifyAll()
}
}
override fun readAt(position: Long, buffer: ByteArray, offset: Int, size: Int): Int {
if (position < 0) return -1
synchronized(lock) {
while (!closed && !finished && position >= totalSize) {
lock.wait()
}
if (closed) return -1
if (position >= totalSize && finished) return -1
val available = (totalSize - position).toInt()
val toRead = min(size, available)
var remaining = toRead
var destOffset = offset
var pos = position
var index = findChunkIndex(pos)
while (remaining > 0 && index < chunks.size) {
val chunk = chunks[index]
val inChunkOffset = (pos - chunk.start).toInt()
if (inChunkOffset >= chunk.data.size) {
index++
continue
}
val copyLen = min(remaining, chunk.data.size - inChunkOffset)
System.arraycopy(chunk.data, inChunkOffset, buffer, destOffset, copyLen)
remaining -= copyLen
destOffset += copyLen
pos += copyLen
if (inChunkOffset + copyLen >= chunk.data.size) {
index++
}
}
return toRead - remaining
}
}
override fun getSize(): Long = -1 // Length unknown while streaming; the framework reads until readAt returns -1.
override fun close() {
synchronized(lock) {
closed = true
lock.notifyAll()
}
}
private fun findChunkIndex(position: Long): Int {
// The cached index only speeds up sequential reads; rewind on backward
// seeks so inChunkOffset in readAt() can never go negative.
var index =
if (lastReadIndex < chunks.size && chunks[lastReadIndex].start <= position) lastReadIndex else 0
while (index < chunks.size) {
val chunk = chunks[index]
if (position < chunk.start + chunk.data.size) break
index++
}
lastReadIndex = index
return index
}
}
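`StreamingMediaDataSource` blocks `readAt` until `append`/`finish` produce enough bytes, which lets playback start before the download completes. A hedged wiring sketch (API 23+; `onAudioChunk` is a hypothetical chunk callback for illustration, not a real API in this repo):

```kotlin
import android.media.MediaPlayer

// Sketch: feed streamed TTS audio into MediaPlayer via the data source above.
fun playStreamingAudio(onAudioChunk: ((ByteArray?) -> Unit) -> Unit): MediaPlayer {
    val source = StreamingMediaDataSource()
    val player = MediaPlayer()
    player.setDataSource(source)               // MediaPlayer pulls via readAt()
    player.setOnPreparedListener { it.start() }
    player.prepareAsync()                      // keep the caller thread free
    onAudioChunk { chunk ->
        if (chunk != null) {
            source.append(chunk)               // wakes any blocked readAt()
        } else {
            source.finish()                    // EOF: readAt() returns -1 past the end
        }
    }
    return player
}
```

On a network error, calling `source.fail()` (or `close()`) unblocks waiting readers so playback errors out instead of hanging.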

@ -0,0 +1,191 @@
package com.steipete.clawdis.node.voice
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
private val directiveJson = Json { ignoreUnknownKeys = true }
data class TalkDirective(
val voiceId: String? = null,
val modelId: String? = null,
val speed: Double? = null,
val rateWpm: Int? = null,
val stability: Double? = null,
val similarity: Double? = null,
val style: Double? = null,
val speakerBoost: Boolean? = null,
val seed: Long? = null,
val normalize: String? = null,
val language: String? = null,
val outputFormat: String? = null,
val latencyTier: Int? = null,
val once: Boolean? = null,
)
data class TalkDirectiveParseResult(
val directive: TalkDirective?,
val stripped: String,
val unknownKeys: List<String>,
)
object TalkDirectiveParser {
fun parse(text: String): TalkDirectiveParseResult {
val normalized = text.replace("\r\n", "\n")
val lines = normalized.split("\n").toMutableList()
if (lines.isEmpty()) return TalkDirectiveParseResult(null, text, emptyList())
val firstNonEmpty = lines.indexOfFirst { it.trim().isNotEmpty() }
if (firstNonEmpty == -1) return TalkDirectiveParseResult(null, text, emptyList())
val head = lines[firstNonEmpty].trim()
if (!head.startsWith("{") || !head.endsWith("}")) {
return TalkDirectiveParseResult(null, text, emptyList())
}
val obj = parseJsonObject(head) ?: return TalkDirectiveParseResult(null, text, emptyList())
val speakerBoost =
boolValue(obj, listOf("speaker_boost", "speakerBoost"))
?: boolValue(obj, listOf("no_speaker_boost", "noSpeakerBoost"))?.not()
val directive = TalkDirective(
voiceId = stringValue(obj, listOf("voice", "voice_id", "voiceId")),
modelId = stringValue(obj, listOf("model", "model_id", "modelId")),
speed = doubleValue(obj, listOf("speed")),
rateWpm = intValue(obj, listOf("rate", "wpm")),
stability = doubleValue(obj, listOf("stability")),
similarity = doubleValue(obj, listOf("similarity", "similarity_boost", "similarityBoost")),
style = doubleValue(obj, listOf("style")),
speakerBoost = speakerBoost,
seed = longValue(obj, listOf("seed")),
normalize = stringValue(obj, listOf("normalize", "apply_text_normalization")),
language = stringValue(obj, listOf("lang", "language_code", "language")),
outputFormat = stringValue(obj, listOf("output_format", "format")),
latencyTier = intValue(obj, listOf("latency", "latency_tier", "latencyTier")),
once = boolValue(obj, listOf("once")),
)
val hasDirective = listOf(
directive.voiceId,
directive.modelId,
directive.speed,
directive.rateWpm,
directive.stability,
directive.similarity,
directive.style,
directive.speakerBoost,
directive.seed,
directive.normalize,
directive.language,
directive.outputFormat,
directive.latencyTier,
directive.once,
).any { it != null }
if (!hasDirective) return TalkDirectiveParseResult(null, text, emptyList())
val knownKeys = setOf(
"voice", "voice_id", "voiceid",
"model", "model_id", "modelid",
"speed", "rate", "wpm",
"stability", "similarity", "similarity_boost", "similarityboost",
"style",
"speaker_boost", "speakerboost",
"no_speaker_boost", "nospeakerboost",
"seed",
"normalize", "apply_text_normalization",
"lang", "language_code", "language",
"output_format", "format",
"latency", "latency_tier", "latencytier",
"once",
)
val unknownKeys = obj.keys.filter { !knownKeys.contains(it.lowercase()) }.sorted()
lines.removeAt(firstNonEmpty)
if (firstNonEmpty < lines.size) {
if (lines[firstNonEmpty].trim().isEmpty()) {
lines.removeAt(firstNonEmpty)
}
}
return TalkDirectiveParseResult(directive, lines.joinToString("\n"), unknownKeys)
}
private fun parseJsonObject(line: String): JsonObject? {
return try {
directiveJson.parseToJsonElement(line) as? JsonObject
} catch (_: Throwable) {
null
}
}
private fun stringValue(obj: JsonObject, keys: List<String>): String? {
for (key in keys) {
val value = obj[key].asStringOrNull()?.trim()
if (!value.isNullOrEmpty()) return value
}
return null
}
private fun doubleValue(obj: JsonObject, keys: List<String>): Double? {
for (key in keys) {
val value = obj[key].asDoubleOrNull()
if (value != null) return value
}
return null
}
private fun intValue(obj: JsonObject, keys: List<String>): Int? {
for (key in keys) {
val value = obj[key].asIntOrNull()
if (value != null) return value
}
return null
}
private fun longValue(obj: JsonObject, keys: List<String>): Long? {
for (key in keys) {
val value = obj[key].asLongOrNull()
if (value != null) return value
}
return null
}
private fun boolValue(obj: JsonObject, keys: List<String>): Boolean? {
for (key in keys) {
val value = obj[key].asBooleanOrNull()
if (value != null) return value
}
return null
}
}
private fun JsonElement?.asStringOrNull(): String? =
(this as? JsonPrimitive)?.takeIf { it.isString }?.content
private fun JsonElement?.asDoubleOrNull(): Double? {
val primitive = this as? JsonPrimitive ?: return null
return primitive.content.toDoubleOrNull()
}
private fun JsonElement?.asIntOrNull(): Int? {
val primitive = this as? JsonPrimitive ?: return null
return primitive.content.toIntOrNull()
}
private fun JsonElement?.asLongOrNull(): Long? {
val primitive = this as? JsonPrimitive ?: return null
return primitive.content.toLongOrNull()
}
private fun JsonElement?.asBooleanOrNull(): Boolean? {
val primitive = this as? JsonPrimitive ?: return null
val content = primitive.content.trim().lowercase()
return when (content) {
"true", "yes", "1" -> true
"false", "no", "0" -> false
else -> null
}
}
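A usage sketch for the parser above (the directive values here are made up): the first non-empty line is consumed as a directive only when it is a single-line JSON object, and it is stripped from the text that gets spoken.

```kotlin
val result = TalkDirectiveParser.parse(
    """
    {"voice":"v-123","speed":1.2,"once":true}
    Hello from talk mode.
    """.trimIndent(),
)
// directive carries the header; stripped is the remaining speakable text
check(result.directive?.voiceId == "v-123")
check(result.directive?.speed == 1.2)
check(result.directive?.once == true)
check(result.stripped.trim() == "Hello from talk mode.")
```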

File diff suppressed because it is too large.

@ -0,0 +1,14 @@
package com.steipete.clawdis.node.bridge
import io.kotest.core.spec.style.StringSpec
import io.kotest.matchers.shouldBe
class BridgeEndpointKotestTest : StringSpec({
"manual endpoint builds stable id + name" {
val endpoint = BridgeEndpoint.manual("10.0.0.5", 18790)
endpoint.stableId shouldBe "manual|10.0.0.5|18790"
endpoint.name shouldBe "10.0.0.5:18790"
endpoint.host shouldBe "10.0.0.5"
endpoint.port shouldBe 18790
}
})
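The test above pins the `manual|host|port` stable-ID format. A minimal sketch of a `BridgeEndpoint.manual` factory that would satisfy it — the type's fields are assumed purely from the assertions, not from the real source:

```kotlin
// Hypothetical reconstruction of BridgeEndpoint, inferred only from the test above.
data class BridgeEndpoint(
    val host: String,
    val port: Int,
    val stableId: String,
    val name: String,
) {
    companion object {
        // Manual endpoints key on host + port so rediscovery maps back to the same entry.
        fun manual(host: String, port: Int) = BridgeEndpoint(
            host = host,
            port = port,
            stableId = "manual|$host|$port",
            name = "$host:$port",
        )
    }
}
```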


@ -0,0 +1,47 @@
package com.steipete.clawdis.node.node
import org.junit.Assert.assertEquals
import org.junit.Assert.assertTrue
import org.junit.Test
import kotlin.math.min
class JpegSizeLimiterTest {
@Test
fun compressesLargePayloadsUnderLimit() {
val maxBytes = 5 * 1024 * 1024
val result =
JpegSizeLimiter.compressToLimit(
initialWidth = 4000,
initialHeight = 3000,
startQuality = 95,
maxBytes = maxBytes,
encode = { width, height, quality ->
val estimated = (width.toLong() * height.toLong() * quality.toLong()) / 100
val size = min(maxBytes.toLong() * 2, estimated).toInt()
ByteArray(size)
},
)
assertTrue(result.bytes.size <= maxBytes)
assertTrue(result.width <= 4000)
assertTrue(result.height <= 3000)
assertTrue(result.quality <= 95)
}
@Test
fun keepsSmallPayloadsAsIs() {
val maxBytes = 5 * 1024 * 1024
val result =
JpegSizeLimiter.compressToLimit(
initialWidth = 800,
initialHeight = 600,
startQuality = 90,
maxBytes = maxBytes,
encode = { _, _, _ -> ByteArray(120_000) },
)
assertEquals(800, result.width)
assertEquals(600, result.height)
assertEquals(90, result.quality)
}
}
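For context, a minimal sketch of what a `compressToLimit` implementation might look like — quality stepping first, then dimension halving. The step sizes and floors here are assumptions; the real limiter may differ:

```kotlin
// Hypothetical sketch: step quality down, then shrink dimensions, until the
// encoded payload fits under maxBytes. Names mirror the test above.
data class JpegResult(val bytes: ByteArray, val width: Int, val height: Int, val quality: Int)

object JpegSizeLimiter {
    fun compressToLimit(
        initialWidth: Int,
        initialHeight: Int,
        startQuality: Int,
        maxBytes: Int,
        encode: (width: Int, height: Int, quality: Int) -> ByteArray,
    ): JpegResult {
        var width = initialWidth
        var height = initialHeight
        var quality = startQuality
        var bytes = encode(width, height, quality)
        while (bytes.size > maxBytes) {
            if (quality > 50) {
                quality -= 10            // cheap first: trade quality for size
            } else if (width > 320 && height > 240) {
                width /= 2               // then shrink dimensions
                height /= 2
            } else {
                break                    // floor reached; return the smallest attempt
            }
            bytes = encode(width, height, quality)
        }
        return JpegResult(bytes, width, height, quality)
    }
}
```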


@ -0,0 +1,55 @@
package com.steipete.clawdis.node.voice
import org.junit.Assert.assertEquals
import org.junit.Assert.assertNull
import org.junit.Assert.assertTrue
import org.junit.Test
class TalkDirectiveParserTest {
@Test
fun parsesDirectiveAndStripsHeader() {
val input = """
{"voice":"voice-123","once":true}
Hello from talk mode.
""".trimIndent()
val result = TalkDirectiveParser.parse(input)
assertEquals("voice-123", result.directive?.voiceId)
assertEquals(true, result.directive?.once)
assertEquals("Hello from talk mode.", result.stripped.trim())
}
@Test
fun ignoresUnknownKeysButReportsThem() {
val input = """
{"voice":"abc","foo":1,"bar":"baz"}
Hi there.
""".trimIndent()
val result = TalkDirectiveParser.parse(input)
assertEquals("abc", result.directive?.voiceId)
assertTrue(result.unknownKeys.containsAll(listOf("bar", "foo")))
}
@Test
fun parsesAlternateKeys() {
val input = """
{"model_id":"eleven_v3","similarity_boost":0.4,"no_speaker_boost":true,"rate":200}
Speak.
""".trimIndent()
val result = TalkDirectiveParser.parse(input)
assertEquals("eleven_v3", result.directive?.modelId)
assertEquals(0.4, result.directive?.similarity)
assertEquals(false, result.directive?.speakerBoost)
assertEquals(200, result.directive?.rateWpm)
}
@Test
fun returnsNullWhenNoDirectivePresent() {
val input = """
{}
Hello.
""".trimIndent()
val result = TalkDirectiveParser.parse(input)
assertNull(result.directive)
assertEquals(input, result.stripped)
}
}
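The tests above exercise a header-stripping shape: when the first line is a JSON object, it is treated as the directive and removed from the spoken text. A crude illustrative sketch of that shape — the real parser uses a JSON library and supports many more keys; this regex-only version handles just `voice` and is not the actual implementation:

```kotlin
// Illustrative only: first line, if it looks like a JSON object and carries a
// recognized key, becomes the directive; otherwise input passes through untouched.
data class ParsedTalk(val voiceId: String?, val stripped: String)

fun parseTalkDirective(input: String): ParsedTalk {
    val lines = input.lines()
    val first = lines.firstOrNull()?.trim() ?: return ParsedTalk(null, input)
    if (!(first.startsWith("{") && first.endsWith("}"))) return ParsedTalk(null, input)
    val voice = Regex("\"voice\"\\s*:\\s*\"([^\"]*)\"").find(first)?.groupValues?.get(1)
    // No recognized directive key (e.g. bare "{}"): leave the input as-is.
    if (voice == null) return ParsedTalk(null, input)
    return ParsedTalk(voice, lines.drop(1).joinToString("\n"))
}
```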


@ -6,6 +6,15 @@ import Observation
import SwiftUI
import UIKit
protocol BridgePairingClient: Sendable {
func pairAndHello(
endpoint: NWEndpoint,
hello: BridgeHello,
onStatus: (@Sendable (String) -> Void)?) async throws -> String
}
extension BridgeClient: BridgePairingClient {}
@MainActor
@Observable
final class BridgeConnectionController {
@ -16,10 +25,16 @@ final class BridgeConnectionController {
private let discovery = BridgeDiscoveryModel()
private weak var appModel: NodeAppModel?
private var didAutoConnect = false
private var seenStableIDs = Set<String>()
init(appModel: NodeAppModel, startDiscovery: Bool = true) {
private let bridgeClientFactory: @Sendable () -> any BridgePairingClient
init(
appModel: NodeAppModel,
startDiscovery: Bool = true,
bridgeClientFactory: @escaping @Sendable () -> any BridgePairingClient = { BridgeClient() })
{
self.appModel = appModel
self.bridgeClientFactory = bridgeClientFactory
BridgeSettingsStore.bootstrapPersistence()
let defaults = UserDefaults.standard
@ -85,7 +100,7 @@ final class BridgeConnectionController {
let token = KeychainStore.loadString(
service: "com.steipete.clawdis.bridge",
account: "bridge-token.\(instanceId)")?
account: self.keychainAccount(instanceId: instanceId))?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
guard !token.isEmpty else { return }
@ -99,28 +114,40 @@ final class BridgeConnectionController {
guard let port = NWEndpoint.Port(rawValue: UInt16(resolvedPort)) else { return }
self.didAutoConnect = true
appModel.connectToBridge(
endpoint: .hostPort(host: NWEndpoint.Host(manualHost), port: port),
hello: self.makeHello(token: token))
let endpoint = NWEndpoint.hostPort(host: NWEndpoint.Host(manualHost), port: port)
self.startAutoConnect(endpoint: endpoint, token: token, instanceId: instanceId)
return
}
let targetStableID = defaults.string(forKey: "bridge.lastDiscoveredStableID")?
let preferredStableID = defaults.string(forKey: "bridge.preferredStableID")?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
guard !targetStableID.isEmpty else { return }
let lastDiscoveredStableID = defaults.string(forKey: "bridge.lastDiscoveredStableID")?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
let candidates = [preferredStableID, lastDiscoveredStableID].filter { !$0.isEmpty }
guard let targetStableID = candidates.first(where: { id in
self.bridges.contains(where: { $0.stableID == id })
}) else { return }
guard let target = self.bridges.first(where: { $0.stableID == targetStableID }) else { return }
self.didAutoConnect = true
appModel.connectToBridge(endpoint: target.endpoint, hello: self.makeHello(token: token))
self.startAutoConnect(endpoint: target.endpoint, token: token, instanceId: instanceId)
}
private func updateLastDiscoveredBridge(from bridges: [BridgeDiscoveryModel.DiscoveredBridge]) {
let newlyDiscovered = bridges.filter { self.seenStableIDs.insert($0.stableID).inserted }
guard let last = newlyDiscovered.last else { return }
let defaults = UserDefaults.standard
let preferred = defaults.string(forKey: "bridge.preferredStableID")?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
let existingLast = defaults.string(forKey: "bridge.lastDiscoveredStableID")?
.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
UserDefaults.standard.set(last.stableID, forKey: "bridge.lastDiscoveredStableID")
BridgeSettingsStore.saveLastDiscoveredBridgeStableID(last.stableID)
// Avoid overriding user intent (preferred/lastDiscovered are also set on manual Connect).
guard preferred.isEmpty, existingLast.isEmpty else { return }
guard let first = bridges.first else { return }
defaults.set(first.stableID, forKey: "bridge.lastDiscoveredStableID")
BridgeSettingsStore.saveLastDiscoveredBridgeStableID(first.stableID)
}
private func makeHello(token: String) -> BridgeHello {
@ -140,6 +167,40 @@ final class BridgeConnectionController {
commands: self.currentCommands())
}
private func keychainAccount(instanceId: String) -> String {
"bridge-token.\(instanceId)"
}
private func startAutoConnect(endpoint: NWEndpoint, token: String, instanceId: String) {
guard let appModel else { return }
Task { [weak self] in
guard let self else { return }
do {
let hello = self.makeHello(token: token)
let refreshed = try await self.bridgeClientFactory().pairAndHello(
endpoint: endpoint,
hello: hello,
onStatus: { status in
Task { @MainActor in
appModel.bridgeStatusText = status
}
})
let resolvedToken = refreshed.isEmpty ? token : refreshed
if !refreshed.isEmpty, refreshed != token {
_ = KeychainStore.saveString(
refreshed,
service: "com.steipete.clawdis.bridge",
account: self.keychainAccount(instanceId: instanceId))
}
appModel.connectToBridge(endpoint: endpoint, hello: self.makeHello(token: resolvedToken))
} catch {
await MainActor.run {
appModel.bridgeStatusText = "Bridge error: \(error.localizedDescription)"
}
}
}
}
private func resolvedDisplayName(defaults: UserDefaults) -> String {
let key = "node.displayName"
let existing = defaults.string(forKey: key)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
@ -265,5 +326,13 @@ extension BridgeConnectionController {
func _test_appVersion() -> String {
self.appVersion()
}
func _test_setBridges(_ bridges: [BridgeDiscoveryModel.DiscoveredBridge]) {
self.bridges = bridges
}
func _test_triggerAutoConnect() {
self.maybeAutoConnect()
}
}
#endif


@ -18,6 +18,12 @@ final class BridgeDiscoveryModel {
var endpoint: NWEndpoint
var stableID: String
var debugID: String
var lanHost: String?
var tailnetDns: String?
var gatewayPort: Int?
var bridgePort: Int?
var canvasPort: Int?
var cliPath: String?
}
var bridges: [DiscoveredBridge] = []
@ -68,7 +74,8 @@ final class BridgeDiscoveryModel {
switch result.endpoint {
case let .service(name, _, _, _):
let decodedName = BonjourEscapes.decode(name)
let advertisedName = result.endpoint.txtRecord?.dictionary["displayName"]
let txt = result.endpoint.txtRecord?.dictionary ?? [:]
let advertisedName = txt["displayName"]
let prettyAdvertised = advertisedName
.map(Self.prettifyInstanceName)
.flatMap { $0.isEmpty ? nil : $0 }
@ -77,7 +84,13 @@ final class BridgeDiscoveryModel {
name: prettyName,
endpoint: result.endpoint,
stableID: BridgeEndpointID.stableID(result.endpoint),
debugID: BridgeEndpointID.prettyDescription(result.endpoint))
debugID: BridgeEndpointID.prettyDescription(result.endpoint),
lanHost: Self.txtValue(txt, key: "lanHost"),
tailnetDns: Self.txtValue(txt, key: "tailnetDns"),
gatewayPort: Self.txtIntValue(txt, key: "gatewayPort"),
bridgePort: Self.txtIntValue(txt, key: "bridgePort"),
canvasPort: Self.txtIntValue(txt, key: "canvasPort"),
cliPath: Self.txtValue(txt, key: "cliPath"))
default:
return nil
}
@ -191,4 +204,14 @@ final class BridgeDiscoveryModel {
.replacingOccurrences(of: #"\s+\(\d+\)$"#, with: "", options: .regularExpression)
return stripped.trimmingCharacters(in: .whitespacesAndNewlines)
}
private static func txtValue(_ dict: [String: String], key: String) -> String? {
let raw = dict[key]?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
return raw.isEmpty ? nil : raw
}
private static func txtIntValue(_ dict: [String: String], key: String) -> Int? {
guard let raw = self.txtValue(dict, key: key) else { return nil }
return Int(raw)
}
}


@ -84,10 +84,14 @@ actor CameraController {
}
withExtendedLifetime(delegate) {}
let maxPayloadBytes = 5 * 1024 * 1024
// Base64 inflates payloads by ~4/3; cap encoded bytes so the payload stays under 5MB (API limit).
let maxEncodedBytes = (maxPayloadBytes / 4) * 3
let res = try JPEGTranscoder.transcodeToJPEG(
imageData: rawData,
maxWidthPx: maxWidth,
quality: quality)
quality: quality,
maxBytes: maxEncodedBytes)
return (
format: format.rawValue,
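The 4/3 base64 inflation noted in the comment above can be checked directly. A small sketch (the 5 MB API limit is taken from the comment; the helper names are illustrative):

```kotlin
import java.util.Base64

// Base64 emits 4 output chars per 3 input bytes, so to keep the encoded payload
// under maxPayloadBytes, the raw bytes must stay under 3/4 of that limit.
fun maxRawBytesForBase64Limit(maxPayloadBytes: Int): Int = (maxPayloadBytes / 4) * 3

fun fitsAfterBase64(raw: ByteArray, maxPayloadBytes: Int): Boolean =
    Base64.getEncoder().encodeToString(raw).length <= maxPayloadBytes
```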


@ -4,18 +4,20 @@ import SwiftUI
struct ChatSheet: View {
@Environment(\.dismiss) private var dismiss
@State private var viewModel: ClawdisChatViewModel
private let userAccent: Color?
init(bridge: BridgeSession, sessionKey: String = "main") {
init(bridge: BridgeSession, sessionKey: String = "main", userAccent: Color? = nil) {
let transport = IOSBridgeChatTransport(bridge: bridge)
self._viewModel = State(
initialValue: ClawdisChatViewModel(
sessionKey: sessionKey,
transport: transport))
self.userAccent = userAccent
}
var body: some View {
NavigationStack {
ClawdisChatView(viewModel: self.viewModel)
ClawdisChatView(viewModel: self.viewModel, userAccent: self.userAccent)
.navigationTitle("Chat")
.navigationBarTitleDisplayMode(.inline)
.toolbar {


@ -22,12 +22,15 @@ final class NodeAppModel {
var bridgeServerName: String?
var bridgeRemoteAddress: String?
var connectedBridgeID: String?
var seamColorHex: String?
var mainSessionKey: String = "main"
private let bridge = BridgeSession()
private var bridgeTask: Task<Void, Never>?
private var voiceWakeSyncTask: Task<Void, Never>?
@ObservationIgnored private var cameraHUDDismissTask: Task<Void, Never>?
let voiceWake = VoiceWakeManager()
let talkMode = TalkModeManager()
private var lastAutoA2uiURL: String?
var bridgeSession: BridgeSession { self.bridge }
@ -35,11 +38,12 @@ final class NodeAppModel {
var cameraHUDText: String?
var cameraHUDKind: CameraHUDKind?
var cameraFlashNonce: Int = 0
var screenRecordActive: Bool = false
init() {
self.voiceWake.configure { [weak self] cmd in
guard let self else { return }
let sessionKey = "main"
let sessionKey = await MainActor.run { self.mainSessionKey }
do {
try await self.sendVoiceTranscript(text: cmd, sessionKey: sessionKey)
} catch {
@ -49,6 +53,9 @@ final class NodeAppModel {
let enabled = UserDefaults.standard.bool(forKey: "voiceWake.enabled")
self.voiceWake.setEnabled(enabled)
self.talkMode.attachBridge(self.bridge)
let talkEnabled = UserDefaults.standard.bool(forKey: "talk.enabled")
self.talkMode.setEnabled(talkEnabled)
// Wire up deep links from canvas taps
self.screen.onDeepLink = { [weak self] url in
@ -145,7 +152,7 @@ final class NodeAppModel {
guard let raw = await self.bridge.currentCanvasHostUrl() else { return nil }
let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty, let base = URL(string: trimmed) else { return nil }
return base.appendingPathComponent("__clawdis__/a2ui/").absoluteString
return base.appendingPathComponent("__clawdis__/a2ui/").absoluteString + "?platform=ios"
}
private func showA2UIOnConnectIfNeeded() async {
@ -177,6 +184,10 @@ final class NodeAppModel {
self.voiceWake.setEnabled(enabled)
}
func setTalkEnabled(_ enabled: Bool) {
self.talkMode.setEnabled(enabled)
}
func connectToBridge(
endpoint: NWEndpoint,
hello: BridgeHello)
@ -216,6 +227,7 @@ final class NodeAppModel {
self.bridgeRemoteAddress = addr
}
}
await self.refreshBrandingFromGateway()
await self.startVoiceWakeSync()
await self.showA2UIOnConnectIfNeeded()
},
@ -255,6 +267,8 @@ final class NodeAppModel {
self.bridgeServerName = nil
self.bridgeRemoteAddress = nil
self.connectedBridgeID = nil
self.seamColorHex = nil
self.mainSessionKey = "main"
self.showLocalCanvasOnDisconnect()
}
}
@ -270,9 +284,47 @@ final class NodeAppModel {
self.bridgeServerName = nil
self.bridgeRemoteAddress = nil
self.connectedBridgeID = nil
self.seamColorHex = nil
self.mainSessionKey = "main"
self.showLocalCanvasOnDisconnect()
}
var seamColor: Color {
Self.color(fromHex: self.seamColorHex) ?? Self.defaultSeamColor
}
private static let defaultSeamColor = Color(red: 79 / 255.0, green: 122 / 255.0, blue: 154 / 255.0)
private static func color(fromHex raw: String?) -> Color? {
let trimmed = (raw ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let hex = trimmed.hasPrefix("#") ? String(trimmed.dropFirst()) : trimmed
guard hex.count == 6, let value = Int(hex, radix: 16) else { return nil }
let r = Double((value >> 16) & 0xFF) / 255.0
let g = Double((value >> 8) & 0xFF) / 255.0
let b = Double(value & 0xFF) / 255.0
return Color(red: r, green: g, blue: b)
}
private func refreshBrandingFromGateway() async {
do {
let res = try await self.bridge.request(method: "config.get", paramsJSON: "{}", timeoutSeconds: 8)
guard let json = try JSONSerialization.jsonObject(with: res) as? [String: Any] else { return }
guard let config = json["config"] as? [String: Any] else { return }
let ui = config["ui"] as? [String: Any]
let raw = (ui?["seamColor"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
let session = config["session"] as? [String: Any]
let rawMainKey = (session?["mainKey"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
let mainKey = rawMainKey.isEmpty ? "main" : rawMainKey
await MainActor.run {
self.seamColorHex = raw.isEmpty ? nil : raw
self.mainSessionKey = mainKey
}
} catch {
// ignore
}
}
func setGlobalWakeWords(_ words: [String]) async {
let sanitized = VoiceWakePreferences.sanitizeTriggerWords(words)
@ -590,6 +642,9 @@ final class NodeAppModel {
NSLocalizedDescriptionKey: "INVALID_REQUEST: screen format must be mp4",
])
}
// Status pill mirrors screen recording state so it stays visible without overlay stacking.
self.screenRecordActive = true
defer { self.screenRecordActive = false }
let path = try await self.screenRecorder.record(
screenIndex: params.screenIndex,
durationMs: params.durationMs,
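The seam-color hex parsing added in this file has a straightforward shape; a Kotlin sketch of the same `#RRGGBB` to normalized-RGB conversion (mirroring the Swift `color(fromHex:)` helper, not shared code):

```kotlin
// Parses "#RRGGBB" or "RRGGBB" into normalized RGB components in 0...1,
// returning null for blank, malformed, or wrong-length input.
fun parseHexColor(raw: String?): Triple<Double, Double, Double>? {
    val trimmed = raw?.trim().orEmpty()
    if (trimmed.isEmpty()) return null
    val hex = trimmed.removePrefix("#")
    if (hex.length != 6) return null
    val value = hex.toIntOrNull(16) ?: return null
    return Triple(
        ((value shr 16) and 0xFF) / 255.0,
        ((value shr 8) and 0xFF) / 255.0,
        (value and 0xFF) / 255.0,
    )
}
```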


@ -51,7 +51,10 @@ struct RootCanvas: View {
case .settings:
SettingsTab()
case .chat:
ChatSheet(bridge: self.appModel.bridgeSession)
ChatSheet(
bridge: self.appModel.bridgeSession,
sessionKey: self.appModel.mainSessionKey,
userAccent: self.appModel.seamColor)
}
}
.onAppear { self.updateIdleTimer() }
@ -119,6 +122,9 @@ struct RootCanvas: View {
}
private struct CanvasContent: View {
@Environment(NodeAppModel.self) private var appModel
@AppStorage("talk.enabled") private var talkEnabled: Bool = false
@AppStorage("talk.button.enabled") private var talkButtonEnabled: Bool = true
var systemColorScheme: ColorScheme
var bridgeStatus: StatusPill.BridgeState
var voiceWakeEnabled: Bool
@ -140,6 +146,21 @@ private struct CanvasContent: View {
}
.accessibilityLabel("Chat")
if self.talkButtonEnabled {
// Talk mode lives on a side bubble so it doesn't get buried in settings.
OverlayButton(
systemImage: self.appModel.talkMode.isEnabled ? "waveform.circle.fill" : "waveform.circle",
brighten: self.brightenButtons,
tint: self.appModel.seamColor,
isActive: self.appModel.talkMode.isEnabled)
{
let next = !self.appModel.talkMode.isEnabled
self.talkEnabled = next
self.appModel.setTalkEnabled(next)
}
.accessibilityLabel("Talk Mode")
}
OverlayButton(systemImage: "gearshape.fill", brighten: self.brightenButtons) {
self.openSettings()
}
@ -148,10 +169,17 @@ private struct CanvasContent: View {
.padding(.top, 10)
.padding(.trailing, 10)
}
.overlay(alignment: .center) {
if self.appModel.talkMode.isEnabled {
TalkOrbOverlay()
.transition(.opacity)
}
}
.overlay(alignment: .topLeading) {
StatusPill(
bridge: self.bridgeStatus,
voiceWakeEnabled: self.voiceWakeEnabled,
activity: self.statusActivity,
brighten: self.brightenButtons,
onTap: {
self.openSettings()
@ -169,45 +197,78 @@ private struct CanvasContent: View {
.transition(.move(edge: .top).combined(with: .opacity))
}
}
.overlay(alignment: .topLeading) {
if let cameraHUDText, !cameraHUDText.isEmpty, let cameraHUDKind {
CameraCaptureToast(
text: cameraHUDText,
kind: self.mapCameraKind(cameraHUDKind),
brighten: self.brightenButtons)
.padding(SwiftUI.Edge.Set.leading, 10)
.safeAreaPadding(SwiftUI.Edge.Set.top, 106)
.transition(
AnyTransition.move(edge: SwiftUI.Edge.top)
.combined(with: AnyTransition.opacity))
}
}
}
private func mapCameraKind(_ kind: NodeAppModel.CameraHUDKind) -> CameraCaptureToast.Kind {
switch kind {
case .photo:
.photo
case .recording:
.recording
case .success:
.success
case .error:
.error
private var statusActivity: StatusPill.Activity? {
// Status pill owns transient activity state so it doesn't overlap the connection indicator.
if self.appModel.isBackgrounded {
return StatusPill.Activity(
title: "Foreground required",
systemImage: "exclamationmark.triangle.fill",
tint: .orange)
}
let bridgeStatus = self.appModel.bridgeStatusText.trimmingCharacters(in: .whitespacesAndNewlines)
let bridgeLower = bridgeStatus.lowercased()
if bridgeLower.contains("repair") {
return StatusPill.Activity(title: "Repairing…", systemImage: "wrench.and.screwdriver", tint: .orange)
}
if bridgeLower.contains("approval") || bridgeLower.contains("pairing") {
return StatusPill.Activity(title: "Approval pending", systemImage: "person.crop.circle.badge.clock")
}
// Avoid duplicating the primary bridge status ("Connecting") in the activity slot.
if self.appModel.screenRecordActive {
return StatusPill.Activity(title: "Recording screen…", systemImage: "record.circle.fill", tint: .red)
}
if let cameraHUDText, !cameraHUDText.isEmpty, let cameraHUDKind {
let systemImage: String
let tint: Color?
switch cameraHUDKind {
case .photo:
systemImage = "camera.fill"
tint = nil
case .recording:
systemImage = "video.fill"
tint = .red
case .success:
systemImage = "checkmark.circle.fill"
tint = .green
case .error:
systemImage = "exclamationmark.triangle.fill"
tint = .red
}
return StatusPill.Activity(title: cameraHUDText, systemImage: systemImage, tint: tint)
}
if self.voiceWakeEnabled {
let voiceStatus = self.appModel.voiceWake.statusText
if voiceStatus.localizedCaseInsensitiveContains("microphone permission") {
return StatusPill.Activity(title: "Mic permission", systemImage: "mic.slash", tint: .orange)
}
if voiceStatus == "Paused" {
let suffix = self.appModel.isBackgrounded ? " (background)" : ""
return StatusPill.Activity(title: "Voice Wake paused\(suffix)", systemImage: "pause.circle.fill")
}
}
return nil
}
}
private struct OverlayButton: View {
let systemImage: String
let brighten: Bool
var tint: Color?
var isActive: Bool = false
let action: () -> Void
var body: some View {
Button(action: self.action) {
Image(systemName: self.systemImage)
.font(.system(size: 16, weight: .semibold))
.foregroundStyle(.primary)
.foregroundStyle(self.isActive ? (self.tint ?? .primary) : .primary)
.padding(10)
.background {
RoundedRectangle(cornerRadius: 12, style: .continuous)
@ -225,9 +286,26 @@ private struct OverlayButton: View {
endPoint: .bottomTrailing))
.blendMode(.overlay)
}
.overlay {
if let tint {
RoundedRectangle(cornerRadius: 12, style: .continuous)
.fill(
LinearGradient(
colors: [
tint.opacity(self.isActive ? 0.22 : 0.14),
tint.opacity(self.isActive ? 0.10 : 0.06),
.clear,
],
startPoint: .topLeading,
endPoint: .bottomTrailing))
.blendMode(.overlay)
}
}
.overlay {
RoundedRectangle(cornerRadius: 12, style: .continuous)
.strokeBorder(.white.opacity(self.brighten ? 0.24 : 0.18), lineWidth: 0.5)
.strokeBorder(
(self.tint ?? .white).opacity(self.isActive ? 0.34 : (self.brighten ? 0.24 : 0.18)),
lineWidth: self.isActive ? 0.7 : 0.5)
}
.shadow(color: .black.opacity(0.35), radius: 12, y: 6)
}
@ -261,59 +339,3 @@ private struct CameraFlashOverlay: View {
}
}
}
private struct CameraCaptureToast: View {
enum Kind {
case photo
case recording
case success
case error
}
var text: String
var kind: Kind
var brighten: Bool = false
var body: some View {
HStack(spacing: 10) {
self.icon
.font(.system(size: 14, weight: .semibold))
.foregroundStyle(.primary)
Text(self.text)
.font(.system(size: 14, weight: .semibold))
.foregroundStyle(.primary)
.lineLimit(1)
.truncationMode(.tail)
}
.padding(.vertical, 10)
.padding(.horizontal, 12)
.background {
RoundedRectangle(cornerRadius: 14, style: .continuous)
.fill(.ultraThinMaterial)
.overlay {
RoundedRectangle(cornerRadius: 14, style: .continuous)
.strokeBorder(.white.opacity(self.brighten ? 0.24 : 0.18), lineWidth: 0.5)
}
.shadow(color: .black.opacity(0.25), radius: 12, y: 6)
}
.accessibilityLabel("Camera")
.accessibilityValue(self.text)
}
@ViewBuilder
private var icon: some View {
switch self.kind {
case .photo:
Image(systemName: "camera.fill")
case .recording:
Image(systemName: "record.circle.fill")
.symbolRenderingMode(.palette)
.foregroundStyle(.red, .primary)
case .success:
Image(systemName: "checkmark.circle.fill")
case .error:
Image(systemName: "exclamationmark.triangle.fill")
}
}
}


@ -26,6 +26,7 @@ struct RootTabs: View {
StatusPill(
bridge: self.bridgeStatus,
voiceWakeEnabled: self.voiceWakeEnabled,
activity: self.statusActivity,
onTap: { self.selectedTab = 2 })
.padding(.leading, 10)
.safeAreaPadding(.top, 10)
@ -79,4 +80,64 @@ struct RootTabs: View {
return .disconnected
}
private var statusActivity: StatusPill.Activity? {
// Keep the top pill consistent across tabs (camera + voice wake + pairing states).
if self.appModel.isBackgrounded {
return StatusPill.Activity(
title: "Foreground required",
systemImage: "exclamationmark.triangle.fill",
tint: .orange)
}
let bridgeStatus = self.appModel.bridgeStatusText.trimmingCharacters(in: .whitespacesAndNewlines)
let bridgeLower = bridgeStatus.lowercased()
if bridgeLower.contains("repair") {
return StatusPill.Activity(title: "Repairing…", systemImage: "wrench.and.screwdriver", tint: .orange)
}
if bridgeLower.contains("approval") || bridgeLower.contains("pairing") {
return StatusPill.Activity(title: "Approval pending", systemImage: "person.crop.circle.badge.clock")
}
// Avoid duplicating the primary bridge status ("Connecting") in the activity slot.
if self.appModel.screenRecordActive {
return StatusPill.Activity(title: "Recording screen…", systemImage: "record.circle.fill", tint: .red)
}
if let cameraHUDText = self.appModel.cameraHUDText,
let cameraHUDKind = self.appModel.cameraHUDKind,
!cameraHUDText.isEmpty
{
let systemImage: String
let tint: Color?
switch cameraHUDKind {
case .photo:
systemImage = "camera.fill"
tint = nil
case .recording:
systemImage = "video.fill"
tint = .red
case .success:
systemImage = "checkmark.circle.fill"
tint = .green
case .error:
systemImage = "exclamationmark.triangle.fill"
tint = .red
}
return StatusPill.Activity(title: cameraHUDText, systemImage: systemImage, tint: tint)
}
if self.voiceWakeEnabled {
let voiceStatus = self.appModel.voiceWake.statusText
if voiceStatus.localizedCaseInsensitiveContains("microphone permission") {
return StatusPill.Activity(title: "Mic permission", systemImage: "mic.slash", tint: .orange)
}
if voiceStatus == "Paused" {
let suffix = self.appModel.isBackgrounded ? " (background)" : ""
return StatusPill.Activity(title: "Voice Wake paused\(suffix)", systemImage: "pause.circle.fill")
}
}
return nil
}
}


@ -43,9 +43,7 @@ final class ScreenController {
self.webView.scrollView.contentInset = .zero
self.webView.scrollView.scrollIndicatorInsets = .zero
self.webView.scrollView.automaticallyAdjustsScrollIndicatorInsets = false
// Disable scroll to allow touch events to pass through to canvas
self.webView.scrollView.isScrollEnabled = false
self.webView.scrollView.bounces = false
self.applyScrollBehavior()
self.webView.navigationDelegate = self.navigationDelegate
self.navigationDelegate.controller = self
a2uiActionHandler.controller = self
@ -60,6 +58,7 @@ final class ScreenController {
func reload() {
let trimmed = self.urlString.trimmingCharacters(in: .whitespacesAndNewlines)
self.applyScrollBehavior()
if trimmed.isEmpty {
guard let url = Self.canvasScaffoldURL else { return }
self.errorText = nil
@ -250,6 +249,15 @@ final class ScreenController {
return false
}
private func applyScrollBehavior() {
let trimmed = self.urlString.trimmingCharacters(in: .whitespacesAndNewlines)
let allowScroll = !trimmed.isEmpty
let scrollView = self.webView.scrollView
// Default canvas needs raw touch events; external pages should scroll.
scrollView.isScrollEnabled = allowScroll
scrollView.bounces = allowScroll
}
private static func jsValue(_ value: String?) -> String {
guard let value else { return "null" }
if let data = try? JSONSerialization.data(withJSONObject: [value]),


@ -1,12 +1,28 @@
import AVFoundation
import ReplayKit
@MainActor
final class ScreenRecordService {
final class ScreenRecordService: @unchecked Sendable {
private struct UncheckedSendableBox<T>: @unchecked Sendable {
let value: T
}
private final class CaptureState: @unchecked Sendable {
private let lock = NSLock()
var writer: AVAssetWriter?
var videoInput: AVAssetWriterInput?
var audioInput: AVAssetWriterInput?
var started = false
var sawVideo = false
var lastVideoTime: CMTime?
var handlerError: Error?
func withLock<T>(_ body: (CaptureState) -> T) -> T {
self.lock.lock()
defer { lock.unlock() }
return body(self)
}
}
enum ScreenRecordError: LocalizedError {
case invalidScreenIndex(Int)
case captureFailed(String)
@ -51,126 +67,158 @@ final class ScreenRecordService {
}()
try? FileManager.default.removeItem(at: outURL)
let recorder = RPScreenRecorder.shared()
recorder.isMicrophoneEnabled = includeAudio
var writer: AVAssetWriter?
var videoInput: AVAssetWriterInput?
var audioInput: AVAssetWriterInput?
var started = false
var sawVideo = false
var lastVideoTime: CMTime?
var handlerError: Error?
let lock = NSLock()
func setHandlerError(_ error: Error) {
lock.lock()
defer { lock.unlock() }
if handlerError == nil { handlerError = error }
}
let state = CaptureState()
let recordQueue = DispatchQueue(label: "com.steipete.clawdis.screenrecord")
try await withCheckedThrowingContinuation { (cont: CheckedContinuation<Void, Error>) in
recorder.startCapture(handler: { sample, type, error in
if let error {
setHandlerError(error)
return
}
guard CMSampleBufferDataIsReady(sample) else { return }
switch type {
case .video:
let pts = CMSampleBufferGetPresentationTimeStamp(sample)
if let lastVideoTime {
let delta = CMTimeSubtract(pts, lastVideoTime)
if delta.seconds < (1.0 / fpsValue) { return }
}
if writer == nil {
guard let imageBuffer = CMSampleBufferGetImageBuffer(sample) else {
setHandlerError(ScreenRecordError.captureFailed("Missing image buffer"))
return
let handler: @Sendable (CMSampleBuffer, RPSampleBufferType, Error?) -> Void = { sample, type, error in
// ReplayKit can call the capture handler on a background queue.
// Serialize writes to avoid queue asserts.
recordQueue.async {
if let error {
state.withLock { state in
if state.handlerError == nil { state.handlerError = error }
}
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
do {
let w = try AVAssetWriter(outputURL: outURL, fileType: .mp4)
let settings: [String: Any] = [
AVVideoCodecKey: AVVideoCodecType.h264,
AVVideoWidthKey: width,
AVVideoHeightKey: height,
]
let vInput = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
vInput.expectsMediaDataInRealTime = true
guard w.canAdd(vInput) else {
throw ScreenRecordError.writeFailed("Cannot add video input")
}
w.add(vInput)
return
}
guard CMSampleBufferDataIsReady(sample) else { return }
if includeAudio {
let aInput = AVAssetWriterInput(mediaType: .audio, outputSettings: nil)
aInput.expectsMediaDataInRealTime = true
if w.canAdd(aInput) {
w.add(aInput)
audioInput = aInput
switch type {
case .video:
let pts = CMSampleBufferGetPresentationTimeStamp(sample)
let shouldSkip = state.withLock { state in
if let lastVideoTime = state.lastVideoTime {
let delta = CMTimeSubtract(pts, lastVideoTime)
return delta.seconds < (1.0 / fpsValue)
}
return false
}
if shouldSkip { return }
if state.withLock({ $0.writer == nil }) {
guard let imageBuffer = CMSampleBufferGetImageBuffer(sample) else {
state.withLock { state in
if state.handlerError == nil {
state.handlerError = ScreenRecordError.captureFailed("Missing image buffer")
}
}
return
}
let width = CVPixelBufferGetWidth(imageBuffer)
let height = CVPixelBufferGetHeight(imageBuffer)
do {
let w = try AVAssetWriter(outputURL: outURL, fileType: .mp4)
let settings: [String: Any] = [
AVVideoCodecKey: AVVideoCodecType.h264,
AVVideoWidthKey: width,
AVVideoHeightKey: height,
]
let vInput = AVAssetWriterInput(mediaType: .video, outputSettings: settings)
vInput.expectsMediaDataInRealTime = true
guard w.canAdd(vInput) else {
throw ScreenRecordError.writeFailed("Cannot add video input")
}
w.add(vInput)
if includeAudio {
let aInput = AVAssetWriterInput(mediaType: .audio, outputSettings: nil)
aInput.expectsMediaDataInRealTime = true
if w.canAdd(aInput) {
w.add(aInput)
state.withLock { state in
state.audioInput = aInput
}
}
}
guard w.startWriting() else {
throw ScreenRecordError
.writeFailed(w.error?.localizedDescription ?? "Failed to start writer")
}
w.startSession(atSourceTime: pts)
state.withLock { state in
state.writer = w
state.videoInput = vInput
state.started = true
}
} catch {
state.withLock { state in
if state.handlerError == nil { state.handlerError = error }
}
return
}
}
let vInput = state.withLock { $0.videoInput }
let isStarted = state.withLock { $0.started }
guard let vInput, isStarted else { return }
if vInput.isReadyForMoreMediaData {
if vInput.append(sample) {
state.withLock { state in
state.sawVideo = true
state.lastVideoTime = pts
}
} else {
let err = state.withLock { $0.writer?.error }
if let err {
state.withLock { state in
if state.handlerError == nil {
state.handlerError = ScreenRecordError.writeFailed(err.localizedDescription)
}
}
}
}
}
case .audioApp, .audioMic:
let aInput = state.withLock { $0.audioInput }
let isStarted = state.withLock { $0.started }
guard includeAudio, let aInput, isStarted else { return }
if aInput.isReadyForMoreMediaData {
_ = aInput.append(sample)
}
@unknown default:
break
}
}
let completion: @Sendable (Error?) -> Void = { error in
if let error { cont.resume(throwing: error) } else { cont.resume() }
}
Task { @MainActor in
startReplayKitCapture(
includeAudio: includeAudio,
handler: handler,
completion: completion)
}
}
try await Task.sleep(nanoseconds: UInt64(durationMs) * 1_000_000)
let stopError = await withCheckedContinuation { cont in
Task { @MainActor in
stopReplayKitCapture { error in cont.resume(returning: error) }
}
}
if let stopError { throw stopError }
let handlerErrorSnapshot = state.withLock { $0.handlerError }
if let handlerErrorSnapshot { throw handlerErrorSnapshot }
let writerSnapshot = state.withLock { $0.writer }
let videoInputSnapshot = state.withLock { $0.videoInput }
let audioInputSnapshot = state.withLock { $0.audioInput }
let sawVideoSnapshot = state.withLock { $0.sawVideo }
guard let writerSnapshot, let videoInputSnapshot, sawVideoSnapshot else {
throw ScreenRecordError.captureFailed("No frames captured")
}
videoInputSnapshot.markAsFinished()
audioInputSnapshot?.markAsFinished()
let writerBox = UncheckedSendableBox(value: writerSnapshot)
try await withCheckedThrowingContinuation { (cont: CheckedContinuation<Void, Error>) in
writerBox.value.finishWriting {
let writer = writerBox.value
@ -199,6 +247,22 @@ final class ScreenRecordService {
}
}
@MainActor
private func startReplayKitCapture(
includeAudio: Bool,
handler: @escaping @Sendable (CMSampleBuffer, RPSampleBufferType, Error?) -> Void,
completion: @escaping @Sendable (Error?) -> Void)
{
let recorder = RPScreenRecorder.shared()
recorder.isMicrophoneEnabled = includeAudio
recorder.startCapture(handler: handler, completionHandler: completion)
}
@MainActor
private func stopReplayKitCapture(_ completion: @escaping @Sendable (Error?) -> Void) {
RPScreenRecorder.shared().stopCapture { error in completion(error) }
}
#if DEBUG
extension ScreenRecordService {
nonisolated static func _test_clampDurationMs(_ ms: Int?) -> Int {

View File

@ -20,6 +20,8 @@ struct SettingsTab: View {
@AppStorage("node.displayName") private var displayName: String = "iOS Node"
@AppStorage("node.instanceId") private var instanceId: String = UUID().uuidString
@AppStorage("voiceWake.enabled") private var voiceWakeEnabled: Bool = false
@AppStorage("talk.enabled") private var talkEnabled: Bool = false
@AppStorage("talk.button.enabled") private var talkButtonEnabled: Bool = true
@AppStorage("camera.enabled") private var cameraEnabled: Bool = true
@AppStorage("screen.preventSleep") private var preventSleep: Bool = true
@AppStorage("bridge.preferredStableID") private var preferredBridgeStableID: String = ""
@ -51,6 +53,9 @@ struct SettingsTab: View {
}
}
}
LabeledContent("Platform", value: self.platformString())
LabeledContent("Version", value: self.appVersion())
LabeledContent("Model", value: self.modelIdentifier())
}
Section("Bridge") {
@ -153,6 +158,12 @@ struct SettingsTab: View {
.onChange(of: self.voiceWakeEnabled) { _, newValue in
self.appModel.setVoiceWakeEnabled(newValue)
}
Toggle("Talk Mode", isOn: self.$talkEnabled)
.onChange(of: self.talkEnabled) { _, newValue in
self.appModel.setTalkEnabled(newValue)
}
// Keep this separate so users can hide the side bubble without disabling Talk Mode.
Toggle("Show Talk Button", isOn: self.$talkButtonEnabled)
NavigationLink {
VoiceWakeWordsSettingsView()
@ -227,6 +238,12 @@ struct SettingsTab: View {
HStack {
VStack(alignment: .leading, spacing: 2) {
Text(bridge.name)
let detailLines = self.bridgeDetailLines(bridge)
ForEach(detailLines, id: \.self) { line in
Text(line)
.font(.footnote)
.foregroundStyle(.secondary)
}
}
Spacer()
@ -504,4 +521,26 @@ struct SettingsTab: View {
private static func httpURLString(host: String?, port: Int?, fallback: String) -> String {
SettingsNetworkingHelpers.httpURLString(host: host, port: port, fallback: fallback)
}
private func bridgeDetailLines(_ bridge: BridgeDiscoveryModel.DiscoveredBridge) -> [String] {
var lines: [String] = []
if let lanHost = bridge.lanHost { lines.append("LAN: \(lanHost)") }
if let tailnet = bridge.tailnetDns { lines.append("Tailnet: \(tailnet)") }
let gatewayPort = bridge.gatewayPort
let bridgePort = bridge.bridgePort
let canvasPort = bridge.canvasPort
if gatewayPort != nil || bridgePort != nil || canvasPort != nil {
let gw = gatewayPort.map(String.init) ?? ""
let br = bridgePort.map(String.init) ?? ""
let canvas = canvasPort.map(String.init) ?? ""
lines.append("Ports: gw \(gw) · bridge \(br) · canvas \(canvas)")
}
if lines.isEmpty {
lines.append(bridge.debugID)
}
return lines
}
}

View File

@ -28,8 +28,15 @@ struct StatusPill: View {
}
}
struct Activity: Equatable {
var title: String
var systemImage: String
var tint: Color?
}
var bridge: BridgeState
var voiceWakeEnabled: Bool
var activity: Activity?
var brighten: Bool = false
var onTap: () -> Void
@ -54,10 +61,24 @@ struct StatusPill: View {
.frame(height: 14)
.opacity(0.35)
if let activity {
HStack(spacing: 6) {
Image(systemName: activity.systemImage)
.font(.system(size: 13, weight: .semibold))
.foregroundStyle(activity.tint ?? .primary)
Text(activity.title)
.font(.system(size: 13, weight: .semibold))
.foregroundStyle(.primary)
.lineLimit(1)
}
.transition(.opacity.combined(with: .move(edge: .top)))
} else {
Image(systemName: self.voiceWakeEnabled ? "mic.fill" : "mic.slash")
.font(.system(size: 13, weight: .semibold))
.foregroundStyle(self.voiceWakeEnabled ? .primary : .secondary)
.accessibilityLabel(self.voiceWakeEnabled ? "Voice Wake enabled" : "Voice Wake disabled")
.transition(.opacity.combined(with: .move(edge: .top)))
}
}
.padding(.vertical, 8)
.padding(.horizontal, 12)
@ -73,7 +94,7 @@ struct StatusPill: View {
}
.buttonStyle(.plain)
.accessibilityLabel("Status")
.accessibilityValue(self.accessibilityValue)
.onAppear { self.updatePulse(for: self.bridge, scenePhase: self.scenePhase) }
.onDisappear { self.pulse = false }
.onChange(of: self.bridge) { _, newValue in
@ -82,6 +103,14 @@ struct StatusPill: View {
.onChange(of: self.scenePhase) { _, newValue in
self.updatePulse(for: self.bridge, scenePhase: newValue)
}
.animation(.easeInOut(duration: 0.18), value: self.activity?.title)
}
private var accessibilityValue: String {
if let activity {
return "\(self.bridge.title), \(activity.title)"
}
return "\(self.bridge.title), Voice Wake \(self.voiceWakeEnabled ? "enabled" : "disabled")"
}
private func updatePulse(for bridge: BridgeState, scenePhase: ScenePhase) {

View File

@ -0,0 +1,713 @@
import AVFAudio
import ClawdisKit
import Foundation
import Observation
import OSLog
import Speech
@MainActor
@Observable
final class TalkModeManager: NSObject {
private typealias SpeechRequest = SFSpeechAudioBufferRecognitionRequest
private static let defaultModelIdFallback = "eleven_v3"
var isEnabled: Bool = false
var isListening: Bool = false
var isSpeaking: Bool = false
var statusText: String = "Off"
private let audioEngine = AVAudioEngine()
private var speechRecognizer: SFSpeechRecognizer?
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private var silenceTask: Task<Void, Never>?
private var lastHeard: Date?
private var lastTranscript: String = ""
private var lastSpokenText: String?
private var lastInterruptedAtSeconds: Double?
private var defaultVoiceId: String?
private var currentVoiceId: String?
private var defaultModelId: String?
private var currentModelId: String?
private var voiceOverrideActive = false
private var modelOverrideActive = false
private var defaultOutputFormat: String?
private var apiKey: String?
private var voiceAliases: [String: String] = [:]
private var interruptOnSpeech: Bool = true
private var mainSessionKey: String = "main"
private var fallbackVoiceId: String?
private var lastPlaybackWasPCM: Bool = false
var pcmPlayer: PCMStreamingAudioPlaying = PCMStreamingAudioPlayer.shared
var mp3Player: StreamingAudioPlaying = StreamingAudioPlayer.shared
private var bridge: BridgeSession?
private let silenceWindow: TimeInterval = 0.7
private var chatSubscribedSessionKeys = Set<String>()
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "TalkMode")
func attachBridge(_ bridge: BridgeSession) {
self.bridge = bridge
}
func setEnabled(_ enabled: Bool) {
self.isEnabled = enabled
if enabled {
self.logger.info("enabled")
Task { await self.start() }
} else {
self.logger.info("disabled")
self.stop()
}
}
func start() async {
guard self.isEnabled else { return }
if self.isListening { return }
self.logger.info("start")
self.statusText = "Requesting permissions…"
let micOk = await Self.requestMicrophonePermission()
guard micOk else {
self.logger.warning("start blocked: microphone permission denied")
self.statusText = "Microphone permission denied"
return
}
let speechOk = await Self.requestSpeechPermission()
guard speechOk else {
self.logger.warning("start blocked: speech permission denied")
self.statusText = "Speech recognition permission denied"
return
}
await self.reloadConfig()
do {
try Self.configureAudioSession()
try self.startRecognition()
self.isListening = true
self.statusText = "Listening"
self.startSilenceMonitor()
await self.subscribeChatIfNeeded(sessionKey: self.mainSessionKey)
self.logger.info("listening")
} catch {
self.isListening = false
self.statusText = "Start failed: \(error.localizedDescription)"
self.logger.error("start failed: \(error.localizedDescription, privacy: .public)")
}
}
func stop() {
self.isEnabled = false
self.isListening = false
self.statusText = "Off"
self.lastTranscript = ""
self.lastHeard = nil
self.silenceTask?.cancel()
self.silenceTask = nil
self.stopRecognition()
self.stopSpeaking()
self.lastInterruptedAtSeconds = nil
TalkSystemSpeechSynthesizer.shared.stop()
do {
try AVAudioSession.sharedInstance().setActive(false, options: [.notifyOthersOnDeactivation])
} catch {
self.logger.warning("audio session deactivate failed: \(error.localizedDescription, privacy: .public)")
}
Task { await self.unsubscribeAllChats() }
}
func userTappedOrb() {
self.stopSpeaking()
}
private func startRecognition() throws {
self.stopRecognition()
self.speechRecognizer = SFSpeechRecognizer()
guard let recognizer = self.speechRecognizer else {
throw NSError(domain: "TalkMode", code: 1, userInfo: [
NSLocalizedDescriptionKey: "Speech recognizer unavailable",
])
}
self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
self.recognitionRequest?.shouldReportPartialResults = true
guard let request = self.recognitionRequest else { return }
let input = self.audioEngine.inputNode
let format = input.outputFormat(forBus: 0)
input.removeTap(onBus: 0)
let tapBlock = Self.makeAudioTapAppendCallback(request: request)
input.installTap(onBus: 0, bufferSize: 2048, format: format, block: tapBlock)
self.audioEngine.prepare()
try self.audioEngine.start()
self.recognitionTask = recognizer.recognitionTask(with: request) { [weak self] result, error in
guard let self else { return }
if let error {
if !self.isSpeaking {
self.statusText = "Speech error: \(error.localizedDescription)"
}
self.logger.debug("speech recognition error: \(error.localizedDescription, privacy: .public)")
}
guard let result else { return }
let transcript = result.bestTranscription.formattedString
Task { @MainActor in
await self.handleTranscript(transcript: transcript, isFinal: result.isFinal)
}
}
}
private func stopRecognition() {
self.recognitionTask?.cancel()
self.recognitionTask = nil
self.recognitionRequest?.endAudio()
self.recognitionRequest = nil
self.audioEngine.inputNode.removeTap(onBus: 0)
self.audioEngine.stop()
self.speechRecognizer = nil
}
private nonisolated static func makeAudioTapAppendCallback(request: SpeechRequest) -> AVAudioNodeTapBlock {
{ buffer, _ in
request.append(buffer)
}
}
private func handleTranscript(transcript: String, isFinal: Bool) async {
let trimmed = transcript.trimmingCharacters(in: .whitespacesAndNewlines)
if self.isSpeaking, self.interruptOnSpeech {
if self.shouldInterrupt(with: trimmed) {
self.stopSpeaking()
}
return
}
guard self.isListening else { return }
if !trimmed.isEmpty {
self.lastTranscript = trimmed
self.lastHeard = Date()
}
if isFinal {
self.lastTranscript = trimmed
}
}
private func startSilenceMonitor() {
self.silenceTask?.cancel()
self.silenceTask = Task { [weak self] in
guard let self else { return }
while self.isEnabled {
try? await Task.sleep(nanoseconds: 200_000_000)
await self.checkSilence()
}
}
}
private func checkSilence() async {
guard self.isListening, !self.isSpeaking else { return }
let transcript = self.lastTranscript.trimmingCharacters(in: .whitespacesAndNewlines)
guard !transcript.isEmpty else { return }
guard let lastHeard else { return }
if Date().timeIntervalSince(lastHeard) < self.silenceWindow { return }
await self.finalizeTranscript(transcript)
}
private func finalizeTranscript(_ transcript: String) async {
self.isListening = false
self.statusText = "Thinking…"
self.lastTranscript = ""
self.lastHeard = nil
self.stopRecognition()
await self.reloadConfig()
let prompt = self.buildPrompt(transcript: transcript)
guard let bridge else {
self.statusText = "Bridge not connected"
self.logger.warning("finalize: bridge not connected")
await self.start()
return
}
do {
let startedAt = Date().timeIntervalSince1970
let sessionKey = self.mainSessionKey
await self.subscribeChatIfNeeded(sessionKey: sessionKey)
self.logger.info(
"chat.send start sessionKey=\(sessionKey, privacy: .public) chars=\(prompt.count, privacy: .public)")
let runId = try await self.sendChat(prompt, bridge: bridge)
self.logger.info("chat.send ok runId=\(runId, privacy: .public)")
let completion = await self.waitForChatCompletion(runId: runId, bridge: bridge, timeoutSeconds: 120)
if completion == .timeout {
self.logger.warning(
"chat completion timeout runId=\(runId, privacy: .public); attempting history fallback")
} else if completion == .aborted {
self.statusText = "Aborted"
self.logger.warning("chat completion aborted runId=\(runId, privacy: .public)")
await self.start()
return
} else if completion == .error {
self.statusText = "Chat error"
self.logger.warning("chat completion error runId=\(runId, privacy: .public)")
await self.start()
return
}
guard let assistantText = try await self.waitForAssistantText(
bridge: bridge,
since: startedAt,
timeoutSeconds: completion == .final ? 12 : 25)
else {
self.statusText = "No reply"
self.logger.warning("assistant text timeout runId=\(runId, privacy: .public)")
await self.start()
return
}
self.logger.info("assistant text ok chars=\(assistantText.count, privacy: .public)")
await self.playAssistant(text: assistantText)
} catch {
self.statusText = "Talk failed: \(error.localizedDescription)"
self.logger.error("finalize failed: \(error.localizedDescription, privacy: .public)")
}
await self.start()
}
private func subscribeChatIfNeeded(sessionKey: String) async {
let key = sessionKey.trimmingCharacters(in: .whitespacesAndNewlines)
guard !key.isEmpty else { return }
guard let bridge else { return }
guard !self.chatSubscribedSessionKeys.contains(key) else { return }
do {
let payload = "{\"sessionKey\":\"\(key)\"}"
try await bridge.sendEvent(event: "chat.subscribe", payloadJSON: payload)
self.chatSubscribedSessionKeys.insert(key)
self.logger.info("chat.subscribe ok sessionKey=\(key, privacy: .public)")
} catch {
self.logger
.warning(
"chat.subscribe failed sessionKey=\(key, privacy: .public) err=\(error.localizedDescription, privacy: .public)")
}
}
private func unsubscribeAllChats() async {
guard let bridge else { return }
let keys = self.chatSubscribedSessionKeys
self.chatSubscribedSessionKeys.removeAll()
for key in keys {
do {
let payload = "{\"sessionKey\":\"\(key)\"}"
try await bridge.sendEvent(event: "chat.unsubscribe", payloadJSON: payload)
} catch {
// ignore
}
}
}
private func buildPrompt(transcript: String) -> String {
let interrupted = self.lastInterruptedAtSeconds
self.lastInterruptedAtSeconds = nil
return TalkPromptBuilder.build(transcript: transcript, interruptedAtSeconds: interrupted)
}
private enum ChatCompletionState: CustomStringConvertible {
case final
case aborted
case error
case timeout
var description: String {
switch self {
case .final: "final"
case .aborted: "aborted"
case .error: "error"
case .timeout: "timeout"
}
}
}
private func sendChat(_ message: String, bridge: BridgeSession) async throws -> String {
struct SendResponse: Decodable { let runId: String }
let payload: [String: Any] = [
"sessionKey": self.mainSessionKey,
"message": message,
"thinking": "low",
"timeoutMs": 30000,
"idempotencyKey": UUID().uuidString,
]
let data = try JSONSerialization.data(withJSONObject: payload)
let json = String(decoding: data, as: UTF8.self)
let res = try await bridge.request(method: "chat.send", paramsJSON: json, timeoutSeconds: 30)
let decoded = try JSONDecoder().decode(SendResponse.self, from: res)
return decoded.runId
}
private func waitForChatCompletion(
runId: String,
bridge: BridgeSession,
timeoutSeconds: Int = 120) async -> ChatCompletionState
{
let stream = await bridge.subscribeServerEvents(bufferingNewest: 200)
return await withTaskGroup(of: ChatCompletionState.self) { group in
group.addTask { [runId] in
for await evt in stream {
if Task.isCancelled { return .timeout }
guard evt.event == "chat", let payload = evt.payloadJSON else { continue }
guard let data = payload.data(using: .utf8) else { continue }
guard let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any] else { continue }
if (json["runId"] as? String) != runId { continue }
if let state = json["state"] as? String {
switch state {
case "final": return .final
case "aborted": return .aborted
case "error": return .error
default: break
}
}
}
return .timeout
}
group.addTask {
try? await Task.sleep(nanoseconds: UInt64(timeoutSeconds) * 1_000_000_000)
return .timeout
}
let result = await group.next() ?? .timeout
group.cancelAll()
return result
}
}
private func waitForAssistantText(
bridge: BridgeSession,
since: Double,
timeoutSeconds: Int) async throws -> String?
{
let deadline = Date().addingTimeInterval(TimeInterval(timeoutSeconds))
while Date() < deadline {
if let text = try await self.fetchLatestAssistantText(bridge: bridge, since: since) {
return text
}
try? await Task.sleep(nanoseconds: 300_000_000)
}
return nil
}
private func fetchLatestAssistantText(bridge: BridgeSession, since: Double? = nil) async throws -> String? {
let res = try await bridge.request(
method: "chat.history",
paramsJSON: "{\"sessionKey\":\"\(self.mainSessionKey)\"}",
timeoutSeconds: 15)
guard let json = try JSONSerialization.jsonObject(with: res) as? [String: Any] else { return nil }
guard let messages = json["messages"] as? [[String: Any]] else { return nil }
for msg in messages.reversed() {
guard (msg["role"] as? String) == "assistant" else { continue }
if let since, let timestamp = msg["timestamp"] as? Double,
TalkHistoryTimestamp.isAfter(timestamp, sinceSeconds: since) == false
{
continue
}
guard let content = msg["content"] as? [[String: Any]] else { continue }
let text = content.compactMap { $0["text"] as? String }.joined(separator: "\n")
let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
if !trimmed.isEmpty { return trimmed }
}
return nil
}
private func playAssistant(text: String) async {
let parsed = TalkDirectiveParser.parse(text)
let directive = parsed.directive
let cleaned = parsed.stripped.trimmingCharacters(in: .whitespacesAndNewlines)
guard !cleaned.isEmpty else { return }
let requestedVoice = directive?.voiceId?.trimmingCharacters(in: .whitespacesAndNewlines)
let resolvedVoice = self.resolveVoiceAlias(requestedVoice)
if requestedVoice?.isEmpty == false, resolvedVoice == nil {
self.logger.warning("unknown voice alias \(requestedVoice ?? "?", privacy: .public)")
}
if let voice = resolvedVoice {
if directive?.once != true {
self.currentVoiceId = voice
self.voiceOverrideActive = true
}
}
if let model = directive?.modelId {
if directive?.once != true {
self.currentModelId = model
self.modelOverrideActive = true
}
}
self.statusText = "Generating voice…"
self.isSpeaking = true
self.lastSpokenText = cleaned
do {
let started = Date()
let language = ElevenLabsTTSClient.validatedLanguage(directive?.language)
let resolvedKey =
(self.apiKey?.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty == false ? self.apiKey : nil) ??
ProcessInfo.processInfo.environment["ELEVENLABS_API_KEY"]
let apiKey = resolvedKey?.trimmingCharacters(in: .whitespacesAndNewlines)
let preferredVoice = resolvedVoice ?? self.currentVoiceId ?? self.defaultVoiceId
let voiceId: String? = if let apiKey, !apiKey.isEmpty {
await self.resolveVoiceId(preferred: preferredVoice, apiKey: apiKey)
} else {
nil
}
let canUseElevenLabs = (voiceId?.isEmpty == false) && (apiKey?.isEmpty == false)
if canUseElevenLabs, let voiceId, let apiKey {
let desiredOutputFormat = (directive?.outputFormat ?? self.defaultOutputFormat)?
.trimmingCharacters(in: .whitespacesAndNewlines)
let requestedOutputFormat = (desiredOutputFormat?.isEmpty == false) ? desiredOutputFormat : nil
let outputFormat = ElevenLabsTTSClient.validatedOutputFormat(requestedOutputFormat ?? "pcm_44100")
if outputFormat == nil, let requestedOutputFormat {
self.logger.warning(
"talk output_format unsupported for local playback: \(requestedOutputFormat, privacy: .public)")
}
let modelId = directive?.modelId ?? self.currentModelId ?? self.defaultModelId
func makeRequest(outputFormat: String?) -> ElevenLabsTTSRequest {
ElevenLabsTTSRequest(
text: cleaned,
modelId: modelId,
outputFormat: outputFormat,
speed: TalkTTSValidation.resolveSpeed(speed: directive?.speed, rateWPM: directive?.rateWPM),
stability: TalkTTSValidation.validatedStability(directive?.stability, modelId: modelId),
similarity: TalkTTSValidation.validatedUnit(directive?.similarity),
style: TalkTTSValidation.validatedUnit(directive?.style),
speakerBoost: directive?.speakerBoost,
seed: TalkTTSValidation.validatedSeed(directive?.seed),
normalize: ElevenLabsTTSClient.validatedNormalize(directive?.normalize),
language: language,
latencyTier: TalkTTSValidation.validatedLatencyTier(directive?.latencyTier))
}
let request = makeRequest(outputFormat: outputFormat)
let client = ElevenLabsTTSClient(apiKey: apiKey)
let stream = client.streamSynthesize(voiceId: voiceId, request: request)
if self.interruptOnSpeech {
do {
try self.startRecognition()
} catch {
self.logger.warning(
"startRecognition during speak failed: \(error.localizedDescription, privacy: .public)")
}
}
self.statusText = "Speaking…"
let sampleRate = TalkTTSValidation.pcmSampleRate(from: outputFormat)
let result: StreamingPlaybackResult
if let sampleRate {
self.lastPlaybackWasPCM = true
var playback = await self.pcmPlayer.play(stream: stream, sampleRate: sampleRate)
if !playback.finished, playback.interruptedAt == nil {
let mp3Format = ElevenLabsTTSClient.validatedOutputFormat("mp3_44100")
self.logger.warning("pcm playback failed; retrying mp3")
self.lastPlaybackWasPCM = false
let mp3Stream = client.streamSynthesize(
voiceId: voiceId,
request: makeRequest(outputFormat: mp3Format))
playback = await self.mp3Player.play(stream: mp3Stream)
}
result = playback
} else {
self.lastPlaybackWasPCM = false
result = await self.mp3Player.play(stream: stream)
}
self.logger
.info(
"elevenlabs stream finished=\(result.finished, privacy: .public) dur=\(Date().timeIntervalSince(started), privacy: .public)s")
if !result.finished, let interruptedAt = result.interruptedAt {
self.lastInterruptedAtSeconds = interruptedAt
}
} else {
self.logger.warning("tts unavailable; falling back to system voice (missing key or voiceId)")
if self.interruptOnSpeech {
do {
try self.startRecognition()
} catch {
self.logger.warning(
"startRecognition during speak failed: \(error.localizedDescription, privacy: .public)")
}
}
self.statusText = "Speaking (System)…"
try await TalkSystemSpeechSynthesizer.shared.speak(text: cleaned, language: language)
}
} catch {
self.logger.error(
"tts failed: \(error.localizedDescription, privacy: .public); falling back to system voice")
do {
if self.interruptOnSpeech {
do {
try self.startRecognition()
} catch {
self.logger.warning(
"startRecognition during speak failed: \(error.localizedDescription, privacy: .public)")
}
}
self.statusText = "Speaking (System)…"
let language = ElevenLabsTTSClient.validatedLanguage(directive?.language)
try await TalkSystemSpeechSynthesizer.shared.speak(text: cleaned, language: language)
} catch {
self.statusText = "Speak failed: \(error.localizedDescription)"
self.logger.error("system voice failed: \(error.localizedDescription, privacy: .public)")
}
}
self.stopRecognition()
self.isSpeaking = false
}
private func stopSpeaking(storeInterruption: Bool = true) {
guard self.isSpeaking else { return }
let interruptedAt = self.lastPlaybackWasPCM
? self.pcmPlayer.stop()
: self.mp3Player.stop()
if storeInterruption {
self.lastInterruptedAtSeconds = interruptedAt
}
_ = self.lastPlaybackWasPCM
? self.mp3Player.stop()
: self.pcmPlayer.stop()
TalkSystemSpeechSynthesizer.shared.stop()
self.isSpeaking = false
}
private func shouldInterrupt(with transcript: String) -> Bool {
let trimmed = transcript.trimmingCharacters(in: .whitespacesAndNewlines)
guard trimmed.count >= 3 else { return false }
if let spoken = self.lastSpokenText?.lowercased(), spoken.contains(trimmed.lowercased()) {
return false
}
return true
}
private func resolveVoiceAlias(_ value: String?) -> String? {
let trimmed = (value ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let normalized = trimmed.lowercased()
if let mapped = self.voiceAliases[normalized] { return mapped }
if self.voiceAliases.values.contains(where: { $0.caseInsensitiveCompare(trimmed) == .orderedSame }) {
return trimmed
}
return Self.isLikelyVoiceId(trimmed) ? trimmed : nil
}
private func resolveVoiceId(preferred: String?, apiKey: String) async -> String? {
let trimmed = preferred?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
if !trimmed.isEmpty {
if let resolved = self.resolveVoiceAlias(trimmed) { return resolved }
self.logger.warning("unknown voice alias \(trimmed, privacy: .public)")
}
if let fallbackVoiceId { return fallbackVoiceId }
do {
let voices = try await ElevenLabsTTSClient(apiKey: apiKey).listVoices()
guard let first = voices.first else {
self.logger.warning("elevenlabs voices list empty")
return nil
}
self.fallbackVoiceId = first.voiceId
if self.defaultVoiceId == nil {
self.defaultVoiceId = first.voiceId
}
if !self.voiceOverrideActive {
self.currentVoiceId = first.voiceId
}
let name = first.name ?? "unknown"
self.logger
.info("default voice selected \(name, privacy: .public) (\(first.voiceId, privacy: .public))")
return first.voiceId
} catch {
self.logger.error("elevenlabs list voices failed: \(error.localizedDescription, privacy: .public)")
return nil
}
}
private static func isLikelyVoiceId(_ value: String) -> Bool {
guard value.count >= 10 else { return false }
return value.allSatisfy { $0.isLetter || $0.isNumber || $0 == "-" || $0 == "_" }
}
private func reloadConfig() async {
guard let bridge else { return }
do {
let res = try await bridge.request(method: "config.get", paramsJSON: "{}", timeoutSeconds: 8)
guard let json = try JSONSerialization.jsonObject(with: res) as? [String: Any] else { return }
guard let config = json["config"] as? [String: Any] else { return }
let talk = config["talk"] as? [String: Any]
let session = config["session"] as? [String: Any]
let rawMainKey = (session?["mainKey"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
self.mainSessionKey = rawMainKey.isEmpty ? "main" : rawMainKey
self.defaultVoiceId = (talk?["voiceId"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines)
if let aliases = talk?["voiceAliases"] as? [String: Any] {
var resolved: [String: String] = [:]
for (key, value) in aliases {
guard let id = value as? String else { continue }
let normalizedKey = key.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
let trimmedId = id.trimmingCharacters(in: .whitespacesAndNewlines)
guard !normalizedKey.isEmpty, !trimmedId.isEmpty else { continue }
resolved[normalizedKey] = trimmedId
}
self.voiceAliases = resolved
} else {
self.voiceAliases = [:]
}
if !self.voiceOverrideActive {
self.currentVoiceId = self.defaultVoiceId
}
let model = (talk?["modelId"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines)
self.defaultModelId = (model?.isEmpty == false) ? model : Self.defaultModelIdFallback
if !self.modelOverrideActive {
self.currentModelId = self.defaultModelId
}
self.defaultOutputFormat = (talk?["outputFormat"] as? String)?
.trimmingCharacters(in: .whitespacesAndNewlines)
self.apiKey = (talk?["apiKey"] as? String)?.trimmingCharacters(in: .whitespacesAndNewlines)
if let interrupt = talk?["interruptOnSpeech"] as? Bool {
self.interruptOnSpeech = interrupt
}
} catch {
self.defaultModelId = Self.defaultModelIdFallback
if !self.modelOverrideActive {
self.currentModelId = self.defaultModelId
}
}
}
private static func configureAudioSession() throws {
let session = AVAudioSession.sharedInstance()
try session.setCategory(.playAndRecord, mode: .voiceChat, options: [
.duckOthers,
.mixWithOthers,
.allowBluetoothHFP,
.defaultToSpeaker,
])
try session.setActive(true, options: [])
}
private nonisolated static func requestMicrophonePermission() async -> Bool {
await withCheckedContinuation(isolation: nil) { cont in
AVAudioApplication.requestRecordPermission { ok in
cont.resume(returning: ok)
}
}
}
private nonisolated static func requestSpeechPermission() async -> Bool {
await withCheckedContinuation(isolation: nil) { cont in
SFSpeechRecognizer.requestAuthorization { status in
cont.resume(returning: status == .authorized)
}
}
}
}

View File

@ -0,0 +1,70 @@
import SwiftUI
struct TalkOrbOverlay: View {
@Environment(NodeAppModel.self) private var appModel
@State private var pulse: Bool = false
var body: some View {
let seam = self.appModel.seamColor
let status = self.appModel.talkMode.statusText.trimmingCharacters(in: .whitespacesAndNewlines)
VStack(spacing: 14) {
ZStack {
Circle()
.stroke(seam.opacity(0.26), lineWidth: 2)
.frame(width: 320, height: 320)
.scaleEffect(self.pulse ? 1.15 : 0.96)
.opacity(self.pulse ? 0.0 : 1.0)
.animation(.easeOut(duration: 1.3).repeatForever(autoreverses: false), value: self.pulse)
Circle()
.stroke(seam.opacity(0.18), lineWidth: 2)
.frame(width: 320, height: 320)
.scaleEffect(self.pulse ? 1.45 : 1.02)
.opacity(self.pulse ? 0.0 : 0.9)
.animation(.easeOut(duration: 1.9).repeatForever(autoreverses: false).delay(0.2), value: self.pulse)
Circle()
.fill(
RadialGradient(
colors: [
seam.opacity(0.95),
seam.opacity(0.40),
Color.black.opacity(0.55),
],
center: .center,
startRadius: 1,
endRadius: 112))
.frame(width: 190, height: 190)
.overlay(
Circle()
.stroke(seam.opacity(0.35), lineWidth: 1))
.shadow(color: seam.opacity(0.32), radius: 26, x: 0, y: 0)
.shadow(color: Color.black.opacity(0.50), radius: 22, x: 0, y: 10)
}
.contentShape(Circle())
.onTapGesture {
self.appModel.talkMode.userTappedOrb()
}
if !status.isEmpty, status != "Off" {
Text(status)
.font(.system(.footnote, design: .rounded).weight(.semibold))
.foregroundStyle(Color.white.opacity(0.92))
.padding(.horizontal, 12)
.padding(.vertical, 8)
.background(
Capsule()
.fill(Color.black.opacity(0.40))
.overlay(
Capsule().stroke(seam.opacity(0.22), lineWidth: 1)))
}
}
.padding(28)
.onAppear {
self.pulse = true
}
.accessibilityElement(children: .combine)
.accessibilityLabel("Talk Mode \(status)")
}
}

View File

@ -4,6 +4,7 @@ struct VoiceTab: View {
@Environment(NodeAppModel.self) private var appModel
@Environment(VoiceWakeManager.self) private var voiceWake
@AppStorage("voiceWake.enabled") private var voiceWakeEnabled: Bool = false
@AppStorage("talk.enabled") private var talkEnabled: Bool = false
var body: some View {
NavigationStack {
@ -14,6 +15,7 @@ struct VoiceTab: View {
Text(self.voiceWake.statusText)
.font(.footnote)
.foregroundStyle(.secondary)
LabeledContent("Talk Mode", value: self.talkEnabled ? "Enabled" : "Disabled")
}
Section("Notes") {
@ -36,6 +38,9 @@ struct VoiceTab: View {
.onChange(of: self.voiceWakeEnabled) { _, newValue in
self.appModel.setVoiceWakeEnabled(newValue)
}
.onChange(of: self.talkEnabled) { _, newValue in
self.appModel.setTalkEnabled(newValue)
}
}
}
}

View File

@ -54,4 +54,7 @@ Sources/Voice/VoiceWakePreferences.swift
../shared/ClawdisKit/Sources/ClawdisKit/ScreenCommands.swift
../shared/ClawdisKit/Sources/ClawdisKit/StoragePaths.swift
../shared/ClawdisKit/Sources/ClawdisKit/SystemCommands.swift
../shared/ClawdisKit/Sources/ClawdisKit/TalkDirective.swift
../../Swabble/Sources/SwabbleKit/WakeWordGate.swift
Sources/Voice/TalkModeManager.swift
Sources/Voice/TalkOrbOverlay.swift

View File

@ -1,5 +1,6 @@
import ClawdisKit
import Foundation
import Network
import Testing
import UIKit
@testable import Clawdis
@ -15,6 +16,25 @@ private let instanceIdEntry = KeychainEntry(service: nodeService, account: "inst
private let preferredBridgeEntry = KeychainEntry(service: bridgeService, account: "preferredStableID")
private let lastBridgeEntry = KeychainEntry(service: bridgeService, account: "lastDiscoveredStableID")
private actor MockBridgePairingClient: BridgePairingClient {
private(set) var lastToken: String?
private let resultToken: String
init(resultToken: String) {
self.resultToken = resultToken
}
func pairAndHello(
endpoint: NWEndpoint,
hello: BridgeHello,
onStatus: (@Sendable (String) -> Void)?) async throws -> String
{
self.lastToken = hello.token
onStatus?("Testing…")
return self.resultToken
}
}
private func withUserDefaults<T>(_ updates: [String: Any?], _ body: () throws -> T) rethrows -> T {
let defaults = UserDefaults.standard
var snapshot: [String: Any?] = [:]
@ -40,6 +60,35 @@ private func withUserDefaults<T>(_ updates: [String: Any?], _ body: () throws ->
return try body()
}
@MainActor
private func withUserDefaults<T>(
_ updates: [String: Any?],
_ body: () async throws -> T) async rethrows -> T
{
let defaults = UserDefaults.standard
var snapshot: [String: Any?] = [:]
for key in updates.keys {
snapshot[key] = defaults.object(forKey: key)
}
for (key, value) in updates {
if let value {
defaults.set(value, forKey: key)
} else {
defaults.removeObject(forKey: key)
}
}
defer {
for (key, value) in snapshot {
if let value {
defaults.set(value, forKey: key)
} else {
defaults.removeObject(forKey: key)
}
}
}
return try await body()
}
private func withKeychainValues<T>(_ updates: [KeychainEntry: String?], _ body: () throws -> T) rethrows -> T {
var snapshot: [KeychainEntry: String?] = [:]
for entry in updates.keys {
@ -64,6 +113,34 @@ private func withKeychainValues<T>(_ updates: [KeychainEntry: String?], _ body:
return try body()
}
@MainActor
private func withKeychainValues<T>(
_ updates: [KeychainEntry: String?],
_ body: () async throws -> T) async rethrows -> T
{
var snapshot: [KeychainEntry: String?] = [:]
for entry in updates.keys {
snapshot[entry] = KeychainStore.loadString(service: entry.service, account: entry.account)
}
for (entry, value) in updates {
if let value {
_ = KeychainStore.saveString(value, service: entry.service, account: entry.account)
} else {
_ = KeychainStore.delete(service: entry.service, account: entry.account)
}
}
defer {
for (entry, value) in snapshot {
if let value {
_ = KeychainStore.saveString(value, service: entry.service, account: entry.account)
} else {
_ = KeychainStore.delete(service: entry.service, account: entry.account)
}
}
}
return try await body()
}
@Suite(.serialized) struct BridgeConnectionControllerTests {
@Test @MainActor func resolvedDisplayNameSetsDefaultWhenMissing() {
let defaults = UserDefaults.standard
@ -156,4 +233,109 @@ private func withKeychainValues<T>(_ updates: [KeychainEntry: String?], _ body:
}
}
}
@Test @MainActor func autoConnectRefreshesTokenOnUnauthorized() async {
let bridge = BridgeDiscoveryModel.DiscoveredBridge(
name: "Gateway",
endpoint: .hostPort(host: NWEndpoint.Host("127.0.0.1"), port: 18790),
stableID: "bridge-1",
debugID: "bridge-debug",
lanHost: "Mac.local",
tailnetDns: nil,
gatewayPort: 18789,
bridgePort: 18790,
canvasPort: 18793,
cliPath: nil)
let mock = MockBridgePairingClient(resultToken: "new-token")
let account = "bridge-token.ios-test"
await withKeychainValues([
instanceIdEntry: nil,
preferredBridgeEntry: nil,
lastBridgeEntry: nil,
KeychainEntry(service: bridgeService, account: account): "old-token",
]) {
await withUserDefaults([
"node.instanceId": "ios-test",
"bridge.lastDiscoveredStableID": "bridge-1",
"bridge.manual.enabled": false,
]) {
let appModel = NodeAppModel()
let controller = BridgeConnectionController(
appModel: appModel,
startDiscovery: false,
bridgeClientFactory: { mock })
controller._test_setBridges([bridge])
controller._test_triggerAutoConnect()
for _ in 0..<20 {
if appModel.connectedBridgeID == bridge.stableID { break }
try? await Task.sleep(nanoseconds: 50_000_000)
}
#expect(appModel.connectedBridgeID == bridge.stableID)
let stored = KeychainStore.loadString(service: bridgeService, account: account)
#expect(stored == "new-token")
let lastToken = await mock.lastToken
#expect(lastToken == "old-token")
}
}
}
@Test @MainActor func autoConnectPrefersPreferredBridgeOverLastDiscovered() async {
let bridgeA = BridgeDiscoveryModel.DiscoveredBridge(
name: "Gateway A",
endpoint: .hostPort(host: NWEndpoint.Host("127.0.0.1"), port: 18790),
stableID: "bridge-1",
debugID: "bridge-a",
lanHost: "MacA.local",
tailnetDns: nil,
gatewayPort: 18789,
bridgePort: 18790,
canvasPort: 18793,
cliPath: nil)
let bridgeB = BridgeDiscoveryModel.DiscoveredBridge(
name: "Gateway B",
endpoint: .hostPort(host: NWEndpoint.Host("127.0.0.1"), port: 28790),
stableID: "bridge-2",
debugID: "bridge-b",
lanHost: "MacB.local",
tailnetDns: nil,
gatewayPort: 28789,
bridgePort: 28790,
canvasPort: 28793,
cliPath: nil)
let mock = MockBridgePairingClient(resultToken: "token-ok")
let account = "bridge-token.ios-test"
await withKeychainValues([
instanceIdEntry: nil,
preferredBridgeEntry: nil,
lastBridgeEntry: nil,
KeychainEntry(service: bridgeService, account: account): "old-token",
]) {
await withUserDefaults([
"node.instanceId": "ios-test",
"bridge.preferredStableID": "bridge-2",
"bridge.lastDiscoveredStableID": "bridge-1",
"bridge.manual.enabled": false,
]) {
let appModel = NodeAppModel()
let controller = BridgeConnectionController(
appModel: appModel,
startDiscovery: false,
bridgeClientFactory: { mock })
controller._test_setBridges([bridgeA, bridgeB])
controller._test_triggerAutoConnect()
for _ in 0..<20 {
if appModel.connectedBridgeID == bridgeB.stableID { break }
try? await Task.sleep(nanoseconds: 50_000_000)
}
#expect(appModel.connectedBridgeID == bridgeB.stableID)
}
}
}
}

View File

@ -0,0 +1,13 @@
import Testing
@testable import Clawdis
@Suite struct CameraControllerErrorTests {
@Test func errorDescriptionsAreStable() {
#expect(CameraController.CameraError.cameraUnavailable.errorDescription == "Camera unavailable")
#expect(CameraController.CameraError.microphoneUnavailable.errorDescription == "Microphone unavailable")
#expect(CameraController.CameraError.permissionDenied(kind: "Camera").errorDescription == "Camera permission denied")
#expect(CameraController.CameraError.invalidParams("bad").errorDescription == "bad")
#expect(CameraController.CameraError.captureFailed("nope").errorDescription == "nope")
#expect(CameraController.CameraError.exportFailed("export").errorDescription == "export")
}
}

View File

@ -16,6 +16,15 @@ import WebKit
#expect(scrollView.bounces == false)
}
@Test @MainActor func navigateEnablesScrollForWebPages() {
let screen = ScreenController()
screen.navigate(to: "https://example.com")
let scrollView = screen.webView.scrollView
#expect(scrollView.isScrollEnabled == true)
#expect(scrollView.bounces == true)
}
@Test @MainActor func navigateSlashShowsDefaultCanvas() {
let screen = ScreenController()
screen.navigate(to: "/")

View File

@ -62,7 +62,11 @@ targets:
swiftlint lint --config "$SRCROOT/.swiftlint.yml" --use-script-input-file-lists
settings:
base:
CODE_SIGN_IDENTITY: "Apple Development"
CODE_SIGN_STYLE: Manual
DEVELOPMENT_TEAM: Y5PE65HELJ
PRODUCT_BUNDLE_IDENTIFIER: com.steipete.clawdis.ios
PROVISIONING_PROFILE_SPECIFIER: "com.steipete.clawdis.ios Development"
SWIFT_VERSION: "6.0"
info:
path: Sources/Info.plist

View File

@ -15,6 +15,7 @@ let package = Package(
dependencies: [
.package(url: "https://github.com/orchetect/MenuBarExtraAccess", exact: "1.2.2"),
.package(url: "https://github.com/swiftlang/swift-subprocess.git", from: "0.1.0"),
.package(url: "https://github.com/apple/swift-log.git", from: "1.8.0"),
.package(url: "https://github.com/sparkle-project/Sparkle", from: "2.8.1"),
.package(path: "../shared/ClawdisKit"),
.package(path: "../../Swabble"),
@ -45,6 +46,7 @@ let package = Package(
.product(name: "SwabbleKit", package: "swabble"),
.product(name: "MenuBarExtraAccess", package: "MenuBarExtraAccess"),
.product(name: "Subprocess", package: "swift-subprocess"),
.product(name: "Logging", package: "swift-log"),
.product(name: "Sparkle", package: "Sparkle"),
.product(name: "PeekabooBridge", package: "PeekabooCore"),
.product(name: "PeekabooAutomationKit", package: "PeekabooAutomationKit"),

View File

@ -121,6 +121,18 @@ final class AppState {
forKey: voicePushToTalkEnabledKey) } }
}
var talkEnabled: Bool {
didSet {
self.ifNotPreview {
UserDefaults.standard.set(self.talkEnabled, forKey: talkEnabledKey)
Task { await TalkModeController.shared.setEnabled(self.talkEnabled) }
}
}
}
/// Gateway-provided UI accent color (hex). Optional; clients provide a default.
var seamColorHex: String?
var iconOverride: IconOverrideSelection {
didSet { self.ifNotPreview { UserDefaults.standard.set(self.iconOverride.rawValue, forKey: iconOverrideKey) } }
}
@ -216,6 +228,8 @@ final class AppState {
.stringArray(forKey: voiceWakeAdditionalLocalesKey) ?? []
self.voicePushToTalkEnabled = UserDefaults.standard
.object(forKey: voicePushToTalkEnabledKey) as? Bool ?? false
self.talkEnabled = UserDefaults.standard.bool(forKey: talkEnabledKey)
self.seamColorHex = nil
if let storedHeartbeats = UserDefaults.standard.object(forKey: heartbeatsEnabledKey) as? Bool {
self.heartbeatsEnabled = storedHeartbeats
} else {
@ -256,9 +270,13 @@ final class AppState {
if self.swabbleEnabled, !PermissionManager.voiceWakePermissionsGranted() {
self.swabbleEnabled = false
}
if self.talkEnabled, !PermissionManager.voiceWakePermissionsGranted() {
self.talkEnabled = false
}
if !self.isPreview {
Task { await VoiceWakeRuntime.shared.refresh(state: self) }
Task { await TalkModeController.shared.setEnabled(self.talkEnabled) }
}
}
@ -312,6 +330,31 @@ final class AppState {
Task { await VoiceWakeRuntime.shared.refresh(state: self) }
}
func setTalkEnabled(_ enabled: Bool) async {
guard voiceWakeSupported else {
self.talkEnabled = false
await GatewayConnection.shared.talkMode(enabled: false, phase: "disabled")
return
}
self.talkEnabled = enabled
guard !self.isPreview else { return }
if !enabled {
await GatewayConnection.shared.talkMode(enabled: false, phase: "disabled")
return
}
if PermissionManager.voiceWakePermissionsGranted() {
await GatewayConnection.shared.talkMode(enabled: true, phase: "enabled")
return
}
let granted = await PermissionManager.ensureVoiceWakePermissions(interactive: true)
self.talkEnabled = granted
await GatewayConnection.shared.talkMode(enabled: granted, phase: granted ? "enabled" : "denied")
}
// MARK: - Global wake words sync (Gateway-owned)
func applyGlobalVoiceWakeTriggers(_ triggers: [String]) {
@ -367,6 +410,7 @@ extension AppState {
state.voiceWakeLocaleID = Locale.current.identifier
state.voiceWakeAdditionalLocaleIDs = ["en-US", "de-DE"]
state.voicePushToTalkEnabled = false
state.talkEnabled = false
state.iconOverride = .system
state.heartbeatsEnabled = true
state.connectionMode = .local

View File

@ -79,7 +79,14 @@ actor CameraCaptureService {
}
withExtendedLifetime(delegate) {}
let res = try JPEGTranscoder.transcodeToJPEG(imageData: rawData, maxWidthPx: maxWidth, quality: quality)
let maxPayloadBytes = 5 * 1024 * 1024
// Base64 inflates payloads by ~4/3, so cap the raw JPEG bytes at 3/4 of the 5MB API limit
// to keep the encoded payload under it.
let maxRawBytes = (maxPayloadBytes / 4) * 3
let res = try JPEGTranscoder.transcodeToJPEG(
imageData: rawData,
maxWidthPx: maxWidth,
quality: quality,
maxBytes: maxRawBytes)
return (data: res.data, size: CGSize(width: res.widthPx, height: res.heightPx))
}
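
The 4/3 inflation factor follows from Base64 encoding every 3 input bytes as 4 output characters, so capping raw bytes at 3/4 of the payload limit keeps the encoded size in bounds. A quick sketch of the arithmetic, assuming the 5MB limit used in this diff:

```swift
let maxPayloadBytes = 5 * 1024 * 1024       // encoded payload limit: 5_242_880
let maxRawBytes = (maxPayloadBytes / 4) * 3 // raw cap: 3_932_160 bytes
// Base64 output length for n bytes is 4 * ceil(n / 3).
let encodedLength = 4 * ((maxRawBytes + 2) / 3) // 5_242_880, exactly at the limit
```

Because `maxRawBytes` is divisible by 3, no padding slack is wasted and the encoded form lands exactly on the cap.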

View File

@ -190,7 +190,7 @@ final class CanvasManager {
private static func resolveA2UIHostUrl(from raw: String?) -> String? {
let trimmed = raw?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
guard !trimmed.isEmpty, let base = URL(string: trimmed) else { return nil }
return base.appendingPathComponent("__clawdis__/a2ui/").absoluteString
return base.appendingPathComponent("__clawdis__/a2ui/").absoluteString + "?platform=macos"
}
// MARK: - Anchoring

View File

@ -1,5 +1,4 @@
import AppKit
import OSLog
let canvasWindowLogger = Logger(subsystem: "com.steipete.clawdis", category: "Canvas")

View File

@ -1,6 +1,8 @@
import Foundation
import OSLog
enum ClawdisConfigFile {
private static let logger = Logger(subsystem: "com.steipete.clawdis", category: "config")
static func url() -> URL {
FileManager.default.homeDirectoryForCurrentUser
.appendingPathComponent(".clawdis")
@ -15,8 +17,18 @@ enum ClawdisConfigFile {
static func loadDict() -> [String: Any] {
let url = self.url()
guard let data = try? Data(contentsOf: url) else { return [:] }
return (try? JSONSerialization.jsonObject(with: data) as? [String: Any]) ?? [:]
guard FileManager.default.fileExists(atPath: url.path) else { return [:] }
do {
let data = try Data(contentsOf: url)
guard let root = try JSONSerialization.jsonObject(with: data) as? [String: Any] else {
self.logger.warning("config JSON root invalid")
return [:]
}
return root
} catch {
self.logger.warning("config read failed: \(error.localizedDescription)")
return [:]
}
}
static func saveDict(_ dict: [String: Any]) {
@ -28,7 +40,9 @@ enum ClawdisConfigFile {
at: url.deletingLastPathComponent(),
withIntermediateDirectories: true)
try data.write(to: url, options: [.atomic])
} catch {}
} catch {
self.logger.error("config save failed: \(error.localizedDescription)")
}
}
static func loadGatewayDict() -> [String: Any] {
@ -60,6 +74,7 @@ enum ClawdisConfigFile {
browser["enabled"] = enabled
root["browser"] = browser
self.saveDict(root)
self.logger.debug("browser control updated enabled=\(enabled)")
}
static func agentWorkspace() -> String? {
@ -79,5 +94,6 @@ enum ClawdisConfigFile {
}
root["agent"] = agent
self.saveDict(root)
self.logger.debug("agent workspace updated set=\(!trimmed.isEmpty)")
}
}

View File

@ -16,6 +16,10 @@ enum CommandResolver {
RuntimeLocator.resolve(searchPaths: self.preferredPaths())
}
static func runtimeResolution(searchPaths: [String]?) -> Result<RuntimeResolution, RuntimeResolutionError> {
RuntimeLocator.resolve(searchPaths: searchPaths ?? self.preferredPaths())
}
static func makeRuntimeCommand(
runtime: RuntimeResolution,
entrypoint: String,
@ -152,8 +156,8 @@ enum CommandResolver {
return paths
}
static func findExecutable(named name: String) -> String? {
for dir in self.preferredPaths() {
static func findExecutable(named name: String, searchPaths: [String]? = nil) -> String? {
for dir in (searchPaths ?? self.preferredPaths()) {
let candidate = (dir as NSString).appendingPathComponent(name)
if FileManager.default.isExecutableFile(atPath: candidate) {
return candidate
@ -162,8 +166,14 @@ enum CommandResolver {
return nil
}
static func clawdisExecutable() -> String? {
self.findExecutable(named: self.helperName)
static func clawdisExecutable(searchPaths: [String]? = nil) -> String? {
self.findExecutable(named: self.helperName, searchPaths: searchPaths)
}
static func projectClawdisExecutable(projectRoot: URL? = nil) -> String? {
let root = projectRoot ?? self.projectRoot()
let candidate = root.appendingPathComponent("node_modules/.bin").appendingPathComponent(self.helperName).path
return FileManager.default.isExecutableFile(atPath: candidate) ? candidate : nil
}
static func nodeCliPath() -> String? {
@ -171,17 +181,18 @@ enum CommandResolver {
return FileManager.default.isReadableFile(atPath: candidate) ? candidate : nil
}
static func hasAnyClawdisInvoker() -> Bool {
if self.clawdisExecutable() != nil { return true }
if self.findExecutable(named: "pnpm") != nil { return true }
if self.findExecutable(named: "node") != nil, self.nodeCliPath() != nil { return true }
static func hasAnyClawdisInvoker(searchPaths: [String]? = nil) -> Bool {
if self.clawdisExecutable(searchPaths: searchPaths) != nil { return true }
if self.findExecutable(named: "pnpm", searchPaths: searchPaths) != nil { return true }
if self.findExecutable(named: "node", searchPaths: searchPaths) != nil, self.nodeCliPath() != nil { return true }
return false
}
static func clawdisNodeCommand(
subcommand: String,
extraArgs: [String] = [],
defaults: UserDefaults = .standard) -> [String]
defaults: UserDefaults = .standard,
searchPaths: [String]? = nil) -> [String]
{
let settings = self.connectionSettings(defaults: defaults)
if settings.mode == .remote, let ssh = self.sshNodeCommand(
@ -192,25 +203,29 @@ enum CommandResolver {
return ssh
}
let runtimeResult = self.runtimeResolution()
let runtimeResult = self.runtimeResolution(searchPaths: searchPaths)
switch runtimeResult {
case let .success(runtime):
if let clawdisPath = self.clawdisExecutable() {
let root = self.projectRoot()
if let clawdisPath = self.projectClawdisExecutable(projectRoot: root) {
return [clawdisPath, subcommand] + extraArgs
}
if let entry = self.gatewayEntrypoint(in: self.projectRoot()) {
if let entry = self.gatewayEntrypoint(in: root) {
return self.makeRuntimeCommand(
runtime: runtime,
entrypoint: entry,
subcommand: subcommand,
extraArgs: extraArgs)
}
if let pnpm = self.findExecutable(named: "pnpm") {
if let pnpm = self.findExecutable(named: "pnpm", searchPaths: searchPaths) {
// Use --silent to avoid pnpm lifecycle banners that would corrupt JSON outputs.
return [pnpm, "--silent", "clawdis", subcommand] + extraArgs
}
if let clawdisPath = self.clawdisExecutable(searchPaths: searchPaths) {
return [clawdisPath, subcommand] + extraArgs
}
let missingEntry = """
clawdis entrypoint missing (looked for dist/index.js or bin/clawdis.js); run pnpm build.
@ -226,9 +241,10 @@ enum CommandResolver {
static func clawdisCommand(
subcommand: String,
extraArgs: [String] = [],
defaults: UserDefaults = .standard) -> [String]
defaults: UserDefaults = .standard,
searchPaths: [String]? = nil) -> [String]
{
self.clawdisNodeCommand(subcommand: subcommand, extraArgs: extraArgs, defaults: defaults)
self.clawdisNodeCommand(subcommand: subcommand, extraArgs: extraArgs, defaults: defaults, searchPaths: searchPaths)
}
// MARK: - SSH helpers
@ -258,7 +274,7 @@ enum CommandResolver {
"/bin",
"/usr/sbin",
"/sbin",
"/Users/steipete/Library/pnpm",
"$HOME/Library/pnpm",
"$PATH",
].joined(separator: ":")
let quotedArgs = ([subcommand] + extraArgs).map(self.shellQuote).joined(separator: " ")

View File

@ -31,6 +31,12 @@ struct ConfigSettings: View {
@State private var browserColorHex: String = "#FF4500"
@State private var browserAttachOnly: Bool = false
// Talk mode settings (stored in ~/.clawdis/clawdis.json under "talk")
@State private var talkVoiceId: String = ""
@State private var talkInterruptOnSpeech: Bool = true
@State private var talkApiKey: String = ""
@State private var gatewayApiKeyFound = false
var body: some View {
ScrollView { self.content }
.onChange(of: self.modelCatalogPath) { _, _ in
@ -45,6 +51,7 @@ struct ConfigSettings: View {
self.hasLoaded = true
self.loadConfig()
await self.loadModels()
await self.refreshGatewayTalkApiKey()
self.allowAutosave = true
}
}
@ -56,6 +63,8 @@ struct ConfigSettings: View {
.disabled(self.isNixMode)
self.heartbeatSection
.disabled(self.isNixMode)
self.talkSection
.disabled(self.isNixMode)
self.browserSection
.disabled(self.isNixMode)
Spacer(minLength: 0)
@ -272,18 +281,101 @@ struct ConfigSettings: View {
.frame(maxWidth: .infinity, alignment: .leading)
}
private var talkSection: some View {
GroupBox("Talk Mode") {
Grid(alignment: .leadingFirstTextBaseline, horizontalSpacing: 14, verticalSpacing: 10) {
GridRow {
self.gridLabel("Voice ID")
VStack(alignment: .leading, spacing: 6) {
HStack(spacing: 8) {
TextField("ElevenLabs voice ID", text: self.$talkVoiceId)
.textFieldStyle(.roundedBorder)
.frame(maxWidth: .infinity)
.onChange(of: self.talkVoiceId) { _, _ in self.autosaveConfig() }
if !self.talkVoiceSuggestions.isEmpty {
Menu {
ForEach(self.talkVoiceSuggestions, id: \.self) { value in
Button(value) {
self.talkVoiceId = value
self.autosaveConfig()
}
}
} label: {
Label("Suggestions", systemImage: "chevron.up.chevron.down")
}
.fixedSize()
}
}
Text("Defaults to ELEVENLABS_VOICE_ID / SAG_VOICE_ID if unset.")
.font(.footnote)
.foregroundStyle(.secondary)
}
}
GridRow {
self.gridLabel("API key")
VStack(alignment: .leading, spacing: 6) {
HStack(spacing: 8) {
SecureField("ELEVENLABS_API_KEY", text: self.$talkApiKey)
.textFieldStyle(.roundedBorder)
.frame(maxWidth: .infinity)
.disabled(self.hasEnvApiKey)
.onChange(of: self.talkApiKey) { _, _ in self.autosaveConfig() }
if !self.hasEnvApiKey && !self.talkApiKey.isEmpty {
Button("Clear") {
self.talkApiKey = ""
self.autosaveConfig()
}
}
}
self.statusLine(label: self.apiKeyStatusLabel, color: self.apiKeyStatusColor)
if self.hasEnvApiKey {
Text("Using ELEVENLABS_API_KEY from the environment.")
.font(.footnote)
.foregroundStyle(.secondary)
} else if self.gatewayApiKeyFound && self.talkApiKey.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
Text("Using API key from the gateway profile.")
.font(.footnote)
.foregroundStyle(.secondary)
}
}
}
GridRow {
self.gridLabel("Interrupt")
Toggle("Stop speaking when you start talking", isOn: self.$talkInterruptOnSpeech)
.labelsHidden()
.toggleStyle(.checkbox)
.onChange(of: self.talkInterruptOnSpeech) { _, _ in self.autosaveConfig() }
}
}
}
.frame(maxWidth: .infinity, alignment: .leading)
}
private func gridLabel(_ text: String) -> some View {
Text(text)
.foregroundStyle(.secondary)
.frame(width: self.labelColumnWidth, alignment: .leading)
}
private func statusLine(label: String, color: Color) -> some View {
HStack(spacing: 6) {
Circle()
.fill(color)
.frame(width: 6, height: 6)
Text(label)
.font(.footnote)
.foregroundStyle(.secondary)
}
.padding(.top, 2)
}
private func loadConfig() {
let parsed = self.loadConfigDict()
let agent = parsed["agent"] as? [String: Any]
let heartbeatMinutes = agent?["heartbeatMinutes"] as? Int
let heartbeatBody = agent?["heartbeatBody"] as? String
let browser = parsed["browser"] as? [String: Any]
let talk = parsed["talk"] as? [String: Any]
let loadedModel = (agent?["model"] as? String) ?? ""
if !loadedModel.isEmpty {
@ -303,6 +395,28 @@ struct ConfigSettings: View {
if let color = browser["color"] as? String, !color.isEmpty { self.browserColorHex = color }
if let attachOnly = browser["attachOnly"] as? Bool { self.browserAttachOnly = attachOnly }
}
if let talk {
if let voice = talk["voiceId"] as? String { self.talkVoiceId = voice }
if let apiKey = talk["apiKey"] as? String { self.talkApiKey = apiKey }
if let interrupt = talk["interruptOnSpeech"] as? Bool {
self.talkInterruptOnSpeech = interrupt
}
}
}
private func refreshGatewayTalkApiKey() async {
do {
let snap: ConfigSnapshot = try await GatewayConnection.shared.requestDecoded(
method: .configGet,
params: nil,
timeoutMs: 8000)
let talk = snap.config?["talk"]?.dictionaryValue
let apiKey = talk?["apiKey"]?.stringValue?.trimmingCharacters(in: .whitespacesAndNewlines)
self.gatewayApiKeyFound = !(apiKey ?? "").isEmpty
} catch {
self.gatewayApiKeyFound = false
}
}
private func autosaveConfig() {
@ -318,6 +432,7 @@ struct ConfigSettings: View {
var root = self.loadConfigDict()
var agent = root["agent"] as? [String: Any] ?? [:]
var browser = root["browser"] as? [String: Any] ?? [:]
var talk = root["talk"] as? [String: Any] ?? [:]
let chosenModel = (self.configModel == "__custom__" ? self.customModel : self.configModel)
.trimmingCharacters(in: .whitespacesAndNewlines)
@ -343,6 +458,21 @@ struct ConfigSettings: View {
browser["attachOnly"] = self.browserAttachOnly
root["browser"] = browser
let trimmedVoice = self.talkVoiceId.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmedVoice.isEmpty {
talk.removeValue(forKey: "voiceId")
} else {
talk["voiceId"] = trimmedVoice
}
let trimmedApiKey = self.talkApiKey.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmedApiKey.isEmpty {
talk.removeValue(forKey: "apiKey")
} else {
talk["apiKey"] = trimmedApiKey
}
talk["interruptOnSpeech"] = self.talkInterruptOnSpeech
root["talk"] = talk
ClawdisConfigFile.saveDict(root)
}
@ -360,6 +490,41 @@ struct ConfigSettings: View {
return Color(red: r, green: g, blue: b)
}
private var talkVoiceSuggestions: [String] {
let env = ProcessInfo.processInfo.environment
let candidates = [
self.talkVoiceId,
env["ELEVENLABS_VOICE_ID"] ?? "",
env["SAG_VOICE_ID"] ?? "",
]
var seen = Set<String>()
return candidates
.map { $0.trimmingCharacters(in: .whitespacesAndNewlines) }
.filter { !$0.isEmpty }
.filter { seen.insert($0).inserted }
}
private var hasEnvApiKey: Bool {
let raw = ProcessInfo.processInfo.environment["ELEVENLABS_API_KEY"] ?? ""
return !raw.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
}
private var apiKeyStatusLabel: String {
if self.hasEnvApiKey { return "ElevenLabs API key: found (environment)" }
if !self.talkApiKey.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty {
return "ElevenLabs API key: stored in config"
}
if self.gatewayApiKeyFound { return "ElevenLabs API key: found (gateway)" }
return "ElevenLabs API key: missing"
}
private var apiKeyStatusColor: Color {
if self.hasEnvApiKey { return .green }
if !self.talkApiKey.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty { return .green }
if self.gatewayApiKeyFound { return .green }
return .red
}
private var browserPathLabel: String? {
guard self.browserEnabled else { return nil }

View File

@ -294,6 +294,11 @@ final class ConnectionsStore {
: nil
self.configRoot = snap.config?.mapValues { $0.foundationValue } ?? [:]
self.configLoaded = true
let ui = snap.config?["ui"]?.dictionaryValue
let rawSeam = ui?["seamColor"]?.stringValue?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
AppStateStore.shared.seamColorHex = rawSeam.isEmpty ? nil : rawSeam
let telegram = snap.config?["telegram"]?.dictionaryValue
self.telegramToken = telegram?["botToken"]?.stringValue ?? ""
self.telegramRequireMention = telegram?["requireMention"]?.boolValue ?? true

View File

@ -16,6 +16,7 @@ let voiceWakeMicKey = "clawdis.voiceWakeMicID"
let voiceWakeLocaleKey = "clawdis.voiceWakeLocaleID"
let voiceWakeAdditionalLocalesKey = "clawdis.voiceWakeAdditionalLocaleIDs"
let voicePushToTalkEnabledKey = "clawdis.voicePushToTalkEnabled"
let talkEnabledKey = "clawdis.talkEnabled"
let iconOverrideKey = "clawdis.iconOverride"
let connectionModeKey = "clawdis.connectionMode"
let remoteTargetKey = "clawdis.remoteTarget"
@ -31,5 +32,6 @@ let modelCatalogReloadKey = "clawdis.modelCatalogReload"
let attachExistingGatewayOnlyKey = "clawdis.gateway.attachExistingOnly"
let heartbeatsEnabledKey = "clawdis.heartbeatsEnabled"
let debugFileLogEnabledKey = "clawdis.debug.fileLogEnabled"
let appLogLevelKey = "clawdis.debug.appLogLevel"
let voiceWakeSupported: Bool = ProcessInfo.processInfo.operatingSystemVersion.majorVersion >= 26
let cliHelperSearchPaths = ["/usr/local/bin", "/opt/homebrew/bin"]

View File

@ -1,7 +1,6 @@
import ClawdisProtocol
import Foundation
import Observation
import OSLog
import SwiftUI
struct ControlHeartbeatEvent: Codable {

View File

@ -1,8 +1,10 @@
import AppKit
import Observation
import SwiftUI
import UniformTypeIdentifiers
struct DebugSettings: View {
@Bindable var state: AppState
private let isPreview = ProcessInfo.processInfo.isPreview
private let labelColumnWidth: CGFloat = 140
@AppStorage(modelCatalogPathKey) private var modelCatalogPath: String = ModelCatalogLoader.defaultPath
@ -28,6 +30,7 @@ struct DebugSettings: View {
@State private var pendingKill: DebugActions.PortListener?
@AppStorage(attachExistingGatewayOnlyKey) private var attachExistingGatewayOnly: Bool = false
@AppStorage(debugFileLogEnabledKey) private var diagnosticsFileLogEnabled: Bool = false
@AppStorage(appLogLevelKey) private var appLogLevelRaw: String = AppLogLevel.default.rawValue
@State private var canvasSessionKey: String = "main"
@State private var canvasStatus: String?
@ -36,6 +39,10 @@ struct DebugSettings: View {
@State private var canvasEvalResult: String?
@State private var canvasSnapshotPath: String?
init(state: AppState = AppStateStore.shared) {
self.state = state
}
var body: some View {
ScrollView(.vertical) {
VStack(alignment: .leading, spacing: 14) {
@ -194,7 +201,9 @@ struct DebugSettings: View {
.overlay(RoundedRectangle(cornerRadius: 6).stroke(Color.secondary.opacity(0.2)))
HStack(spacing: 8) {
Button("Restart Gateway") { DebugActions.restartGateway() }
if self.canRestartGateway {
Button("Restart Gateway") { DebugActions.restartGateway() }
}
Button("Clear log") { GatewayProcessManager.shared.clearLog() }
Spacer(minLength: 0)
}
@ -224,13 +233,23 @@ struct DebugSettings: View {
}
GridRow {
self.gridLabel("Diagnostics")
VStack(alignment: .leading, spacing: 6) {
self.gridLabel("App logging")
VStack(alignment: .leading, spacing: 8) {
Picker("Verbosity", selection: self.$appLogLevelRaw) {
ForEach(AppLogLevel.allCases) { level in
Text(level.title).tag(level.rawValue)
}
}
.pickerStyle(.menu)
.labelsHidden()
.help("Controls the macOS app log verbosity.")
Toggle("Write rolling diagnostics log (JSONL)", isOn: self.$diagnosticsFileLogEnabled)
.toggleStyle(.checkbox)
.help(
"Writes a rotating, local-only diagnostics log under ~/Library/Logs/Clawdis/. " +
"Writes a rotating, local-only log under ~/Library/Logs/Clawdis/. " +
"Enable only while actively debugging.")
HStack(spacing: 8) {
Button("Open folder") {
NSWorkspace.shared.open(DiagnosticsFileLog.logDirectoryURL())
@ -762,6 +781,10 @@ struct DebugSettings: View {
CommandResolver.connectionSettings().mode == .remote
}
private var canRestartGateway: Bool {
self.state.connectionMode == .local && !self.attachExistingGatewayOnly
}
private func configURL() -> URL {
FileManager.default.homeDirectoryForCurrentUser
.appendingPathComponent(".clawdis")
@ -902,7 +925,7 @@ private struct PlainSettingsGroupBoxStyle: GroupBoxStyle {
#if DEBUG
struct DebugSettings_Previews: PreviewProvider {
static var previews: some View {
DebugSettings()
DebugSettings(state: .preview)
.frame(width: SettingsTab.windowWidth, height: SettingsTab.windowHeight)
}
}
@ -910,7 +933,7 @@ struct DebugSettings_Previews: PreviewProvider {
@MainActor
extension DebugSettings {
static func exerciseForTesting() async {
let view = DebugSettings()
let view = DebugSettings(state: .preview)
view.modelsCount = 3
view.modelsLoading = false
view.modelsError = "Failed to load models"

View File

@ -7,6 +7,8 @@ struct DevicePresentation: Sendable {
enum DeviceModelCatalog {
private static let modelIdentifierToName: [String: String] = loadModelIdentifierToName()
private static let resourceBundle: Bundle? = locateResourceBundle()
private static let resourceSubdirectory = "DeviceModels"
static func presentation(deviceFamily: String?, modelIdentifier: String?) -> DevicePresentation? {
let family = (deviceFamily ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
@@ -104,13 +106,11 @@ enum DeviceModelCatalog {
}
private static func loadMapping(resourceName: String) -> [String: String] {
guard let url = self.resourceURL(
resourceName: resourceName,
guard let url = self.resourceBundle?.url(
forResource: resourceName,
withExtension: "json",
subdirectory: "DeviceModels")
else {
return [:]
}
subdirectory: self.resourceSubdirectory)
else { return [:] }
do {
let data = try Data(contentsOf: url)
@@ -121,37 +121,48 @@ enum DeviceModelCatalog {
}
}
private static func resourceURL(
resourceName: String,
withExtension ext: String,
subdirectory: String
) -> URL? {
let bundledSubdir = "Clawdis_Clawdis.bundle/\(subdirectory)"
let mainBundle = Bundle.main
if let url = mainBundle.url(forResource: resourceName, withExtension: ext, subdirectory: bundledSubdir)
?? mainBundle.url(forResource: resourceName, withExtension: ext, subdirectory: subdirectory)
{
return url
private static func locateResourceBundle() -> Bundle? {
if let bundle = self.bundleIfContainsDeviceModels(Bundle.module) {
return bundle
}
let fallbackBases = [
mainBundle.resourceURL,
mainBundle.bundleURL.appendingPathComponent("Contents/Resources"),
mainBundle.bundleURL.deletingLastPathComponent(),
].compactMap { $0 }
if let bundle = self.bundleIfContainsDeviceModels(Bundle.main) {
return bundle
}
let fileName = "\(resourceName).\(ext)"
for base in fallbackBases {
let bundled = base.appendingPathComponent(bundledSubdir).appendingPathComponent(fileName)
if FileManager.default.fileExists(atPath: bundled.path) { return bundled }
let loose = base.appendingPathComponent(subdirectory).appendingPathComponent(fileName)
if FileManager.default.fileExists(atPath: loose.path) { return loose }
if let resourceURL = Bundle.main.resourceURL {
if let enumerator = FileManager.default.enumerator(
at: resourceURL,
includingPropertiesForKeys: [.isDirectoryKey],
options: [.skipsHiddenFiles]) {
for case let url as URL in enumerator {
guard url.pathExtension == "bundle" else { continue }
if let bundle = Bundle(url: url),
self.bundleIfContainsDeviceModels(bundle) != nil {
return bundle
}
}
}
}
return nil
}
private static func bundleIfContainsDeviceModels(_ bundle: Bundle) -> Bundle? {
if bundle.url(
forResource: "ios-device-identifiers",
withExtension: "json",
subdirectory: self.resourceSubdirectory) != nil {
return bundle
}
if bundle.url(
forResource: "mac-device-identifiers",
withExtension: "json",
subdirectory: self.resourceSubdirectory) != nil {
return bundle
}
return nil
}
private enum NameValue: Decodable {
case string(String)
case stringArray([String])


@@ -1,5 +1,4 @@
import AppKit
import OSLog
/// Central manager for Dock icon visibility.
/// Shows the Dock icon while any windows are visible, regardless of user preference.


@@ -51,6 +51,7 @@ actor GatewayConnection {
case providersStatus = "providers.status"
case configGet = "config.get"
case configSet = "config.set"
case talkMode = "talk.mode"
case webLoginStart = "web.login.start"
case webLoginWait = "web.login.wait"
case webLogout = "web.logout"
@@ -472,7 +473,10 @@ extension GatewayConnection {
params["attachments"] = AnyCodable(encoded)
}
return try await self.requestDecoded(method: .chatSend, params: params)
return try await self.requestDecoded(
method: .chatSend,
params: params,
timeoutMs: Double(timeoutMs))
}
func chatAbort(sessionKey: String, runId: String) async throws -> Bool {
@@ -483,6 +487,12 @@
return res.aborted ?? false
}
func talkMode(enabled: Bool, phase: String? = nil) async {
var params: [String: AnyCodable] = ["enabled": AnyCodable(enabled)]
if let phase { params["phase"] = AnyCodable(phase) }
try? await self.requestVoid(method: .talkMode, params: params)
}
// MARK: - VoiceWake
func voiceWakeGetTriggers() async throws -> [String] {


@@ -1,6 +1,7 @@
import Foundation
enum GatewayLaunchAgentManager {
private static let logger = Logger(subsystem: "com.steipete.clawdis", category: "gateway.launchd")
private static let supportedBindModes: Set<String> = ["loopback", "tailnet", "lan", "auto"]
private static var plistURL: URL {
@@ -26,12 +27,16 @@ enum GatewayLaunchAgentManager {
if enabled {
let gatewayBin = self.gatewayExecutablePath(bundlePath: bundlePath)
guard FileManager.default.isExecutableFile(atPath: gatewayBin) else {
self.logger.error("launchd enable failed: gateway missing at \(gatewayBin)")
return "Embedded gateway missing in bundle; rebuild via scripts/package-mac-app.sh"
}
self.logger.info("launchd enable requested port=\(port)")
self.writePlist(bundlePath: bundlePath, port: port)
_ = await self.runLaunchctl(["bootout", "gui/\(getuid())/\(gatewayLaunchdLabel)"])
let bootstrap = await self.runLaunchctl(["bootstrap", "gui/\(getuid())", self.plistURL.path])
if bootstrap.status != 0 {
let msg = bootstrap.output.trimmingCharacters(in: .whitespacesAndNewlines)
self.logger.error("launchd bootstrap failed: \(msg)")
return bootstrap.output.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty
? "Failed to bootstrap gateway launchd job"
: bootstrap.output.trimmingCharacters(in: .whitespacesAndNewlines)
@@ -42,6 +47,7 @@ enum GatewayLaunchAgentManager {
return nil
}
self.logger.info("launchd disable requested")
_ = await self.runLaunchctl(["bootout", "gui/\(getuid())/\(gatewayLaunchdLabel)"])
try? FileManager.default.removeItem(at: self.plistURL)
return nil
@@ -103,7 +109,11 @@ enum GatewayLaunchAgentManager {
</dict>
</plist>
"""
try? plist.write(to: self.plistURL, atomically: true, encoding: .utf8)
do {
try plist.write(to: self.plistURL, atomically: true, encoding: .utf8)
} catch {
self.logger.error("launchd plist write failed: \(error.localizedDescription)")
}
}
private static func preferredGatewayBind() -> String? {


@@ -42,6 +42,7 @@ final class GatewayProcessManager {
private var environmentRefreshTask: Task<Void, Never>?
private var lastEnvironmentRefresh: Date?
private var logRefreshTask: Task<Void, Never>?
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "gateway.process")
private let logLimit = 20000 // characters to keep in-memory
private let environmentRefreshMinInterval: TimeInterval = 30
@@ -53,8 +54,10 @@
self.stop()
self.status = .stopped
self.appendLog("[gateway] remote mode active; skipping local gateway\n")
self.logger.info("gateway process skipped: remote mode active")
return
}
self.logger.debug("gateway active requested active=\(active)")
self.desiredActive = active
self.refreshEnvironmentStatus()
if active {
@@ -86,6 +89,7 @@
return
}
self.status = .starting
self.logger.debug("gateway start requested")
// First try to latch onto an already-running gateway to avoid spawning a duplicate.
Task { [weak self] in
@@ -98,6 +102,7 @@
await MainActor.run {
self.status = .failed("Attach-only enabled; no gateway to attach")
self.appendLog("[gateway] attach-only enabled; not spawning local gateway\n")
self.logger.warning("gateway attach-only enabled; not spawning")
}
return
}
@@ -110,6 +115,7 @@
self.existingGatewayDetails = nil
self.lastFailureReason = nil
self.status = .stopped
self.logger.info("gateway stop requested")
let bundlePath = Bundle.main.bundleURL.path
Task {
_ = await GatewayLaunchAgentManager.set(
@@ -182,6 +188,7 @@
self.existingGatewayDetails = details
self.status = .attachedExisting(details: details)
self.appendLog("[gateway] using existing instance: \(details)\n")
self.logger.info("gateway using existing instance details=\(details)")
self.refreshControlChannelIfNeeded(reason: "attach existing")
self.refreshLog()
return true
@@ -197,6 +204,7 @@
self.status = .failed(reason)
self.lastFailureReason = reason
self.appendLog("[gateway] existing listener on port \(port) but attach failed: \(reason)\n")
self.logger.warning("gateway attach failed reason=\(reason)")
return true
}
@@ -268,16 +276,19 @@
await MainActor.run {
self.status = .failed(resolution.status.message)
}
self.logger.error("gateway command resolve failed: \(resolution.status.message)")
return
}
let bundlePath = Bundle.main.bundleURL.path
let port = GatewayEnvironment.gatewayPort()
self.appendLog("[gateway] enabling launchd job (\(gatewayLaunchdLabel)) on port \(port)\n")
self.logger.info("gateway enabling launchd port=\(port)")
let err = await GatewayLaunchAgentManager.set(enabled: true, bundlePath: bundlePath, port: port)
if let err {
self.status = .failed(err)
self.lastFailureReason = err
self.logger.error("gateway launchd enable failed: \(err)")
return
}
@@ -290,6 +301,7 @@
let instance = await PortGuardian.shared.describe(port: port)
let details = instance.map { "pid \($0.pid)" }
self.status = .running(details: details)
self.logger.info("gateway started details=\(details ?? "ok")")
self.refreshControlChannelIfNeeded(reason: "gateway started")
self.refreshLog()
return
@@ -300,6 +312,7 @@
self.status = .failed("Gateway did not start in time")
self.lastFailureReason = "launchd start timeout"
self.logger.warning("gateway start timed out")
}
private func appendLog(_ chunk: String) {
@@ -317,6 +330,7 @@
break
}
self.appendLog("[gateway] refreshing control channel (\(reason))\n")
self.logger.debug("gateway control channel refresh reason=\(reason)")
Task { await ControlChannel.shared.configure() }
}
@@ -332,12 +346,14 @@
}
}
self.appendLog("[gateway] readiness wait timed out\n")
self.logger.warning("gateway readiness wait timed out")
return false
}
func clearLog() {
self.log = ""
try? FileManager.default.removeItem(atPath: LogLocator.launchdGatewayLogPath)
self.logger.debug("gateway log cleared")
}
func setProjectRoot(path: String) {


@@ -1,7 +1,6 @@
import Foundation
import Network
import Observation
import OSLog
import SwiftUI
struct HealthSnapshot: Codable, Sendable {


@@ -0,0 +1,229 @@
@_exported import Logging
import Foundation
import OSLog
typealias Logger = Logging.Logger
enum AppLogSettings {
static let logLevelKey = appLogLevelKey
static func logLevel() -> Logger.Level {
if let raw = UserDefaults.standard.string(forKey: self.logLevelKey),
let level = Logger.Level(rawValue: raw)
{
return level
}
return .info
}
static func setLogLevel(_ level: Logger.Level) {
UserDefaults.standard.set(level.rawValue, forKey: self.logLevelKey)
}
static func fileLoggingEnabled() -> Bool {
UserDefaults.standard.bool(forKey: debugFileLogEnabledKey)
}
}
enum AppLogLevel: String, CaseIterable, Identifiable {
case trace
case debug
case info
case notice
case warning
case error
case critical
static let `default`: AppLogLevel = .info
var id: String { self.rawValue }
var title: String {
switch self {
case .trace: "Trace"
case .debug: "Debug"
case .info: "Info"
case .notice: "Notice"
case .warning: "Warning"
case .error: "Error"
case .critical: "Critical"
}
}
}
enum ClawdisLogging {
private static let labelSeparator = "::"
private static let didBootstrap: Void = {
LoggingSystem.bootstrap { label in
let (subsystem, category) = Self.parseLabel(label)
let osHandler = ClawdisOSLogHandler(subsystem: subsystem, category: category)
let fileHandler = ClawdisFileLogHandler(label: label)
return MultiplexLogHandler([osHandler, fileHandler])
}
}()
static func bootstrapIfNeeded() {
_ = Self.didBootstrap
}
static func makeLabel(subsystem: String, category: String) -> String {
"\(subsystem)\(Self.labelSeparator)\(category)"
}
static func parseLabel(_ label: String) -> (String, String) {
guard let range = label.range(of: Self.labelSeparator) else {
return ("com.steipete.clawdis", label)
}
let subsystem = String(label[..<range.lowerBound])
let category = String(label[range.upperBound...])
return (subsystem, category)
}
}
extension Logging.Logger {
init(subsystem: String, category: String) {
ClawdisLogging.bootstrapIfNeeded()
let label = ClawdisLogging.makeLabel(subsystem: subsystem, category: category)
self.init(label: label)
}
}
extension Logger.Message.StringInterpolation {
mutating func appendInterpolation<T>(_ value: T, privacy: OSLogPrivacy) {
self.appendInterpolation(String(describing: value))
}
}
struct ClawdisOSLogHandler: LogHandler {
private let osLogger: OSLog.Logger
var metadata: Logger.Metadata = [:]
var logLevel: Logger.Level {
get { AppLogSettings.logLevel() }
set { AppLogSettings.setLogLevel(newValue) }
}
init(subsystem: String, category: String) {
self.osLogger = OSLog.Logger(subsystem: subsystem, category: category)
}
subscript(metadataKey key: String) -> Logger.Metadata.Value? {
get { self.metadata[key] }
set { self.metadata[key] = newValue }
}
func log(
level: Logger.Level,
message: Logger.Message,
metadata: Logger.Metadata?,
source: String,
file: String,
function: String,
line: UInt)
{
let merged = Self.mergeMetadata(self.metadata, metadata)
let rendered = Self.renderMessage(message, metadata: merged)
self.osLogger.log(level: Self.osLogType(for: level), "\(rendered, privacy: .public)")
}
private static func osLogType(for level: Logger.Level) -> OSLogType {
switch level {
case .trace, .debug:
return .debug
case .info, .notice:
return .info
case .warning:
return .default
case .error:
return .error
case .critical:
return .fault
}
}
private static func mergeMetadata(
_ base: Logger.Metadata,
_ extra: Logger.Metadata?) -> Logger.Metadata
{
guard let extra else { return base }
return base.merging(extra, uniquingKeysWith: { _, new in new })
}
private static func renderMessage(_ message: Logger.Message, metadata: Logger.Metadata) -> String {
guard !metadata.isEmpty else { return message.description }
let meta = metadata
.sorted(by: { $0.key < $1.key })
.map { "\($0.key)=\(stringify($0.value))" }
.joined(separator: " ")
return "\(message.description) [\(meta)]"
}
private static func stringify(_ value: Logger.Metadata.Value) -> String {
switch value {
case let .string(text):
text
case let .stringConvertible(value):
String(describing: value)
case let .array(values):
"[" + values.map { stringify($0) }.joined(separator: ",") + "]"
case let .dictionary(entries):
"{" + entries.map { "\($0.key)=\(stringify($0.value))" }.joined(separator: ",") + "}"
}
}
}
struct ClawdisFileLogHandler: LogHandler {
let label: String
var metadata: Logger.Metadata = [:]
var logLevel: Logger.Level {
get { AppLogSettings.logLevel() }
set { AppLogSettings.setLogLevel(newValue) }
}
subscript(metadataKey key: String) -> Logger.Metadata.Value? {
get { self.metadata[key] }
set { self.metadata[key] = newValue }
}
func log(
level: Logger.Level,
message: Logger.Message,
metadata: Logger.Metadata?,
source: String,
file: String,
function: String,
line: UInt)
{
guard AppLogSettings.fileLoggingEnabled() else { return }
let (subsystem, category) = ClawdisLogging.parseLabel(self.label)
var fields: [String: String] = [
"subsystem": subsystem,
"category": category,
"level": level.rawValue,
"source": source,
"file": file,
"function": function,
"line": "\(line)",
]
let merged = self.metadata.merging(metadata ?? [:], uniquingKeysWith: { _, new in new })
for (key, value) in merged {
fields["meta.\(key)"] = Self.stringify(value)
}
DiagnosticsFileLog.shared.log(category: category, event: message.description, fields: fields)
}
private static func stringify(_ value: Logger.Metadata.Value) -> String {
switch value {
case let .string(text):
text
case let .stringConvertible(value):
String(describing: value)
case let .array(values):
"[" + values.map { stringify($0) }.joined(separator: ",") + "]"
case let .dictionary(entries):
"{" + entries.map { "\($0.key)=\(stringify($0.value))" }.joined(separator: ",") + "}"
}
}
}


@@ -3,7 +3,6 @@ import Darwin
import Foundation
import MenuBarExtraAccess
import Observation
import OSLog
import Security
import SwiftUI
@@ -30,6 +29,7 @@ struct ClawdisApp: App {
}
init() {
ClawdisLogging.bootstrapIfNeeded()
_state = State(initialValue: AppStateStore.shared)
}


@@ -14,11 +14,14 @@ struct MenuContent: View {
private let heartbeatStore = HeartbeatStore.shared
private let controlChannel = ControlChannel.shared
private let activityStore = WorkActivityStore.shared
@Bindable private var pairingPrompter = NodePairingApprovalPrompter.shared
@Environment(\.openSettings) private var openSettings
@State private var availableMics: [AudioInputDevice] = []
@State private var loadingMics = false
@State private var browserControlEnabled = true
@AppStorage(cameraEnabledKey) private var cameraEnabled: Bool = false
@AppStorage(appLogLevelKey) private var appLogLevelRaw: String = AppLogLevel.default.rawValue
@AppStorage(debugFileLogEnabledKey) private var appFileLoggingEnabled: Bool = false
init(state: AppState, updater: UpdaterProviding?) {
self._state = Bindable(wrappedValue: state)
@@ -32,6 +35,13 @@ struct MenuContent: View {
VStack(alignment: .leading, spacing: 2) {
Text(self.connectionLabel)
self.statusLine(label: self.healthStatus.label, color: self.healthStatus.color)
if self.pairingPrompter.pendingCount > 0 {
let repairCount = self.pairingPrompter.pendingRepairCount
let repairSuffix = repairCount > 0 ? " · \(repairCount) repair" : ""
self.statusLine(
label: "Pairing approval pending (\(self.pairingPrompter.pendingCount))\(repairSuffix)",
color: .orange)
}
}
}
.disabled(self.state.connectionMode == .unconfigured)
@@ -102,6 +112,13 @@ struct MenuContent: View {
systemImage: "rectangle.inset.filled.on.rectangle")
}
}
Button {
Task { await self.state.setTalkEnabled(!self.state.talkEnabled) }
} label: {
Label(self.state.talkEnabled ? "Stop Talk Mode" : "Talk Mode", systemImage: "waveform.circle.fill")
}
.disabled(!voiceWakeSupported)
.opacity(voiceWakeSupported ? 1 : 0.5)
Divider()
Button("Settings…") { self.open(tab: .general) }
.keyboardShortcut(",", modifiers: [.command])
@@ -167,6 +184,20 @@ struct MenuContent: View {
: "Verbose Logging (Main): Off",
systemImage: "text.alignleft")
}
Menu("App Logging") {
Picker("Verbosity", selection: self.$appLogLevelRaw) {
ForEach(AppLogLevel.allCases) { level in
Text(level.title).tag(level.rawValue)
}
}
Toggle(isOn: self.$appFileLoggingEnabled) {
Label(
self.appFileLoggingEnabled
? "File Logging: On"
: "File Logging: Off",
systemImage: "doc.text.magnifyingglass")
}
}
Button {
DebugActions.openSessionStore()
} label: {
@@ -194,10 +225,12 @@
Label("Send Test Notification", systemImage: "bell")
}
Divider()
Button {
DebugActions.restartGateway()
} label: {
Label("Restart Gateway", systemImage: "arrow.clockwise")
if self.state.connectionMode == .local, !AppStateStore.attachExistingGatewayOnly {
Button {
DebugActions.restartGateway()
} label: {
Label("Restart Gateway", systemImage: "arrow.clockwise")
}
}
Button {
DebugActions.restartApp()


@@ -22,8 +22,7 @@ final class MenuSessionsInjector: NSObject, NSMenuDelegate {
private var cachedErrorText: String?
private var cacheUpdatedAt: Date?
private let refreshIntervalSeconds: TimeInterval = 12
private let nodesStore = InstancesStore.shared
private let gatewayDiscovery = GatewayDiscoveryModel()
private let nodesStore = NodesStore.shared
#if DEBUG
private var testControlChannelConnected: Bool?
#endif
@@ -43,7 +42,6 @@
}
self.nodesStore.start()
self.gatewayDiscovery.start()
}
func menuWillOpen(_ menu: NSMenu) {
@@ -218,7 +216,7 @@
}
if entries.isEmpty {
let title = self.nodesStore.isLoading ? "Loading nodes..." : "No nodes yet"
let title = self.nodesStore.isLoading ? "Loading devices..." : "No devices yet"
menu.insertItem(self.makeMessageItem(text: title, symbolName: "circle.dashed", width: width), at: cursor)
cursor += 1
} else {
@@ -231,7 +229,7 @@
item.view = HighlightedMenuItemHostView(
rootView: AnyView(NodeMenuRowView(entry: entry, width: width)),
width: width)
item.submenu = self.buildNodeSubmenu(entry: entry)
item.submenu = self.buildNodeSubmenu(entry: entry, width: width)
menu.insertItem(item, at: cursor)
cursor += 1
}
@@ -239,7 +237,7 @@
if entries.count > 8 {
let moreItem = NSMenuItem()
moreItem.tag = self.nodesTag
moreItem.title = "More Nodes..."
moreItem.title = "More Devices..."
moreItem.image = NSImage(systemSymbolName: "ellipsis.circle", accessibilityDescription: nil)
let overflow = Array(entries.dropFirst(8))
moreItem.submenu = self.buildNodesOverflowMenu(entries: overflow, width: width)
@@ -436,7 +434,7 @@
return menu
}
private func buildNodesOverflowMenu(entries: [InstanceInfo], width: CGFloat) -> NSMenu {
private func buildNodesOverflowMenu(entries: [NodeInfo], width: CGFloat) -> NSMenu {
let menu = NSMenu()
for entry in entries {
let item = NSMenuItem()
@@ -446,27 +444,27 @@
item.view = HighlightedMenuItemHostView(
rootView: AnyView(NodeMenuRowView(entry: entry, width: width)),
width: width)
item.submenu = self.buildNodeSubmenu(entry: entry)
item.submenu = self.buildNodeSubmenu(entry: entry, width: width)
menu.addItem(item)
}
return menu
}
private func buildNodeSubmenu(entry: InstanceInfo) -> NSMenu {
private func buildNodeSubmenu(entry: NodeInfo, width: CGFloat) -> NSMenu {
let menu = NSMenu()
menu.autoenablesItems = false
menu.addItem(self.makeNodeCopyItem(label: "ID", value: entry.id))
menu.addItem(self.makeNodeCopyItem(label: "Node ID", value: entry.nodeId))
if let host = entry.host?.nonEmpty {
menu.addItem(self.makeNodeCopyItem(label: "Host", value: host))
if let name = entry.displayName?.nonEmpty {
menu.addItem(self.makeNodeCopyItem(label: "Name", value: name))
}
if let ip = entry.ip?.nonEmpty {
if let ip = entry.remoteIp?.nonEmpty {
menu.addItem(self.makeNodeCopyItem(label: "IP", value: ip))
}
menu.addItem(self.makeNodeCopyItem(label: "Role", value: NodeMenuEntryFormatter.roleText(entry)))
menu.addItem(self.makeNodeCopyItem(label: "Status", value: NodeMenuEntryFormatter.roleText(entry)))
if let platform = NodeMenuEntryFormatter.platformText(entry) {
menu.addItem(self.makeNodeCopyItem(label: "Platform", value: platform))
@@ -476,19 +474,20 @@
menu.addItem(self.makeNodeCopyItem(label: "Version", value: self.formatVersionLabel(version)))
}
menu.addItem(self.makeNodeDetailItem(label: "Last seen", value: entry.ageDescription))
menu.addItem(self.makeNodeDetailItem(label: "Connected", value: entry.isConnected ? "Yes" : "No"))
menu.addItem(self.makeNodeDetailItem(label: "Paired", value: entry.isPaired ? "Yes" : "No"))
if entry.lastInputSeconds != nil {
menu.addItem(self.makeNodeDetailItem(label: "Last input", value: entry.lastInputDescription))
if let caps = entry.caps?.filter({ !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty }),
!caps.isEmpty {
menu.addItem(self.makeNodeCopyItem(label: "Caps", value: caps.joined(separator: ", ")))
}
if let reason = entry.reason?.nonEmpty {
menu.addItem(self.makeNodeDetailItem(label: "Reason", value: reason))
}
if let sshURL = self.sshURL(for: entry) {
menu.addItem(.separator())
menu.addItem(self.makeNodeActionItem(title: "Open SSH", url: sshURL))
if let commands = entry.commands?.filter({ !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty }),
!commands.isEmpty {
menu.addItem(self.makeNodeMultilineItem(
label: "Commands",
value: commands.joined(separator: ", "),
width: width))
}
return menu
@@ -507,12 +506,17 @@
return item
}
private func makeNodeActionItem(title: String, url: URL) -> NSMenuItem {
let item = NSMenuItem(title: title, action: #selector(self.openNodeSSH(_:)), keyEquivalent: "")
private func makeNodeMultilineItem(label: String, value: String, width: CGFloat) -> NSMenuItem {
let item = NSMenuItem()
item.target = self
item.representedObject = url
item.action = #selector(self.copyNodeValue(_:))
item.representedObject = value
item.view = HighlightedMenuItemHostView(
rootView: AnyView(NodeMenuMultilineView(label: label, value: value, width: width)),
width: width)
return item
}
private func formatVersionLabel(_ version: String) -> String {
let trimmed = version.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return version }
@@ -638,104 +642,6 @@ final class MenuSessionsInjector: NSObject, NSMenuDelegate {
NSPasteboard.general.setString(value, forType: .string)
}
@objc
private func openNodeSSH(_ sender: NSMenuItem) {
guard let url = sender.representedObject as? URL else { return }
if let appURL = self.preferredTerminalAppURL() {
NSWorkspace.shared.open(
[url],
withApplicationAt: appURL,
configuration: NSWorkspace.OpenConfiguration(),
completionHandler: nil)
} else {
NSWorkspace.shared.open(url)
}
}
private func preferredTerminalAppURL() -> URL? {
if let ghosty = self.ghostyAppURL() { return ghosty }
return NSWorkspace.shared.urlForApplication(withBundleIdentifier: "com.apple.Terminal")
}
private func ghostyAppURL() -> URL? {
let candidates = [
"/Applications/Ghosty.app",
("~/Applications/Ghosty.app" as NSString).expandingTildeInPath,
]
for path in candidates where FileManager.default.fileExists(atPath: path) {
return URL(fileURLWithPath: path)
}
return nil
}
private func sshURL(for entry: InstanceInfo) -> URL? {
guard NodeMenuEntryFormatter.isGateway(entry) else { return nil }
guard let gateway = self.matchingGateway(for: entry) else { return nil }
guard let host = self.sanitizedTailnetHost(gateway.tailnetDns) ?? gateway.lanHost else { return nil }
let user = NSUserName()
return self.buildSSHURL(user: user, host: host, port: gateway.sshPort)
}
private func matchingGateway(for entry: InstanceInfo) -> GatewayDiscoveryModel.DiscoveredGateway? {
let candidates = self.entryHostCandidates(entry)
guard !candidates.isEmpty else { return nil }
return self.gatewayDiscovery.gateways.first { gateway in
let gatewayTokens = self.gatewayHostTokens(gateway)
return candidates.contains { gatewayTokens.contains($0) }
}
}
private func entryHostCandidates(_ entry: InstanceInfo) -> [String] {
let raw: [String?] = [
entry.host,
entry.ip,
NodeMenuEntryFormatter.primaryName(entry),
]
return raw.compactMap(self.normalizedHostToken(_:))
}
private func gatewayHostTokens(_ gateway: GatewayDiscoveryModel.DiscoveredGateway) -> [String] {
let raw: [String?] = [
gateway.displayName,
gateway.lanHost,
gateway.tailnetDns,
]
return raw.compactMap(self.normalizedHostToken(_:))
}
private func normalizedHostToken(_ value: String?) -> String? {
guard let value else { return nil }
let trimmed = value.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty { return nil }
let lower = trimmed.lowercased().trimmingCharacters(in: CharacterSet(charactersIn: "."))
if lower.hasSuffix(".localdomain") {
return lower.replacingOccurrences(of: ".localdomain", with: ".local")
}
return lower
}
private func sanitizedTailnetHost(_ host: String?) -> String? {
guard let host else { return nil }
let trimmed = host.trimmingCharacters(in: .whitespacesAndNewlines)
if trimmed.isEmpty { return nil }
if trimmed.hasSuffix(".internal.") || trimmed.hasSuffix(".internal") {
return nil
}
return trimmed
}
private func buildSSHURL(user: String, host: String, port: Int) -> URL? {
var components = URLComponents()
components.scheme = "ssh"
components.user = user
components.host = host
if port != 22 {
components.port = port
}
return components.url
}
// MARK: - Width + placement
private func findInsertIndex(in menu: NSMenu) -> Int? {
@@ -790,23 +696,14 @@ final class MenuSessionsInjector: NSObject, NSMenuDelegate {
return width
}
private func sortedNodeEntries() -> [InstanceInfo] {
let entries = self.nodesStore.instances.filter { entry in
let mode = entry.mode?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
return mode != "health"
}
private func sortedNodeEntries() -> [NodeInfo] {
let entries = self.nodesStore.nodes.filter { $0.isConnected }
return entries.sorted { lhs, rhs in
let lhsGateway = NodeMenuEntryFormatter.isGateway(lhs)
let rhsGateway = NodeMenuEntryFormatter.isGateway(rhs)
if lhsGateway != rhsGateway { return lhsGateway }
let lhsLocal = NodeMenuEntryFormatter.isLocal(lhs)
let rhsLocal = NodeMenuEntryFormatter.isLocal(rhs)
if lhsLocal != rhsLocal { return lhsLocal }
if lhs.isConnected != rhs.isConnected { return lhs.isConnected }
if lhs.isPaired != rhs.isPaired { return lhs.isPaired }
let lhsName = NodeMenuEntryFormatter.primaryName(lhs).lowercased()
let rhsName = NodeMenuEntryFormatter.primaryName(rhs).lowercased()
if lhsName == rhsName { return lhs.ts > rhs.ts }
if lhsName == rhsName { return lhs.nodeId < rhs.nodeId }
return lhsName < rhsName
}
}


@@ -4,18 +4,23 @@ import JavaScriptCore
enum ModelCatalogLoader {
static let defaultPath: String = FileManager.default.homeDirectoryForCurrentUser
.appendingPathComponent("Projects/pi-mono/packages/ai/src/models.generated.ts").path
private static let logger = Logger(subsystem: "com.steipete.clawdis", category: "models")
static func load(from path: String) async throws -> [ModelChoice] {
let expanded = (path as NSString).expandingTildeInPath
self.logger.debug("model catalog load start file=\(URL(fileURLWithPath: expanded).lastPathComponent)")
let source = try String(contentsOfFile: expanded, encoding: .utf8)
let sanitized = self.sanitize(source: source)
let ctx = JSContext()
ctx?.exceptionHandler = { _, exception in
if let exception { print("JS exception: \(exception)") }
if let exception {
self.logger.warning("model catalog JS exception: \(exception)")
}
}
ctx?.evaluateScript(sanitized)
guard let rawModels = ctx?.objectForKeyedSubscript("MODELS")?.toDictionary() as? [String: Any] else {
self.logger.error("model catalog parse failed: MODELS missing")
throw NSError(
domain: "ModelCatalogLoader",
code: 1,
@@ -33,12 +38,14 @@ enum ModelCatalogLoader {
}
}
return choices.sorted { lhs, rhs in
let sorted = choices.sorted { lhs, rhs in
if lhs.provider == rhs.provider {
return lhs.name.localizedCaseInsensitiveCompare(rhs.name) == .orderedAscending
}
return lhs.provider.localizedCaseInsensitiveCompare(rhs.provider) == .orderedAscending
}
self.logger.debug("model catalog loaded providers=\(rawModels.count) models=\(sorted.count)")
return sorted
}
private static func sanitize(source: String) -> String {


@@ -265,7 +265,7 @@ actor MacNodeRuntime {
guard let raw = await GatewayConnection.shared.canvasHostUrl() else { return nil }
let trimmed = raw.trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty, let baseUrl = URL(string: trimmed) else { return nil }
return baseUrl.appendingPathComponent("__clawdis__/a2ui/").absoluteString
return baseUrl.appendingPathComponent("__clawdis__/a2ui/").absoluteString + "?platform=macos"
}
private func isA2UIReady(poll: Bool = false) async -> Bool {


@@ -2,6 +2,7 @@ import AppKit
import ClawdisIPC
import ClawdisProtocol
import Foundation
import Observation
import OSLog
import UserNotifications
@@ -15,6 +16,7 @@ enum NodePairingReconcilePolicy {
}
@MainActor
@Observable
final class NodePairingApprovalPrompter {
static let shared = NodePairingApprovalPrompter()
@@ -26,6 +28,8 @@
private var isStopping = false
private var isPresenting = false
private var queue: [PendingRequest] = []
var pendingCount: Int = 0
var pendingRepairCount: Int = 0
private var activeAlert: NSAlert?
private var activeRequestId: String?
private var alertHostWindow: NSWindow?
@ -104,6 +108,7 @@ final class NodePairingApprovalPrompter {
self.reconcileOnceTask?.cancel()
self.reconcileOnceTask = nil
self.queue.removeAll(keepingCapacity: false)
self.updatePendingCounts()
self.isPresenting = false
self.activeRequestId = nil
self.alertHostWindow?.orderOut(nil)
@ -292,6 +297,7 @@ final class NodePairingApprovalPrompter {
private func enqueue(_ req: PendingRequest) {
if self.queue.contains(req) { return }
self.queue.append(req)
self.updatePendingCounts()
self.presentNextIfNeeded()
self.updateReconcileLoop()
}
@ -362,6 +368,7 @@ final class NodePairingApprovalPrompter {
} else {
self.queue.removeAll { $0 == request }
}
self.updatePendingCounts()
self.isPresenting = false
self.presentNextIfNeeded()
self.updateReconcileLoop()
@ -501,6 +508,8 @@ final class NodePairingApprovalPrompter {
} else {
self.queue.removeAll { $0 == req }
}
self.updatePendingCounts()
self.isPresenting = false
self.presentNextIfNeeded()
self.updateReconcileLoop()
@ -599,6 +608,12 @@ final class NodePairingApprovalPrompter {
}
}
private func updatePendingCounts() {
// Keep a cheap observable summary for the menu bar status line.
self.pendingCount = self.queue.count
self.pendingRepairCount = self.queue.filter { $0.isRepair == true }.count
}
private func reconcileOnce(timeoutMs: Double) async {
if self.isStopping { return }
if self.reconcileInFlight { return }
@ -643,6 +658,7 @@ final class NodePairingApprovalPrompter {
return
}
self.queue.removeAll { $0.requestId == resolved.requestId }
self.updatePendingCounts()
Task { @MainActor in
await self.notify(resolution: resolution, request: request, via: "remote")
}

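Because `NodePairingApprovalPrompter` is now `@Observable`, `pendingCount` and `pendingRepairCount` are observable stored properties: any SwiftUI view that reads them re-renders when `updatePendingCounts()` runs. A minimal sketch of a consumer (the view name and copy are hypothetical, not part of this diff):

```swift
import SwiftUI

// Hypothetical menu-bar status line. Reading pendingCount inside body
// registers Observation tracking, so the text updates whenever the
// prompter's queue changes — no Combine publisher or @StateObject needed.
struct PairingStatusLine: View {
    private let prompter = NodePairingApprovalPrompter.shared

    var body: some View {
        if prompter.pendingCount > 0 {
            Text("\(prompter.pendingCount) pending · \(prompter.pendingRepairCount) repair")
                .font(.caption)
        }
    }
}
```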
View File

@ -2,51 +2,53 @@ import AppKit
import SwiftUI
struct NodeMenuEntryFormatter {
static func isGateway(_ entry: InstanceInfo) -> Bool {
entry.mode?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased() == "gateway"
static func isConnected(_ entry: NodeInfo) -> Bool {
entry.isConnected
}
static func isLocal(_ entry: InstanceInfo) -> Bool {
entry.mode?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased() == "local"
static func primaryName(_ entry: NodeInfo) -> String {
entry.displayName?.nonEmpty ?? entry.nodeId
}
static func primaryName(_ entry: InstanceInfo) -> String {
if self.isGateway(entry) {
let host = entry.host?.nonEmpty
if let host, host.lowercased() != "gateway" { return host }
return "Gateway"
static func summaryText(_ entry: NodeInfo) -> String {
let name = self.primaryName(entry)
var prefix = "Node: \(name)"
if let ip = entry.remoteIp?.nonEmpty {
prefix += " (\(ip))"
}
var parts = [prefix]
if let platform = self.platformText(entry) {
parts.append("platform \(platform)")
}
return entry.host?.nonEmpty ?? entry.id
}
static func summaryText(_ entry: InstanceInfo) -> String {
entry.text.nonEmpty ?? self.primaryName(entry)
}
static func roleText(_ entry: InstanceInfo) -> String {
if self.isGateway(entry) { return "gateway" }
if let mode = entry.mode?.nonEmpty { return mode }
return "node"
}
static func detailLeft(_ entry: InstanceInfo) -> String {
let role = self.roleText(entry)
if let ip = entry.ip?.nonEmpty { return "\(ip) · \(role)" }
return role
}
static func detailRight(_ entry: InstanceInfo) -> String? {
var parts: [String] = []
if let platform = self.platformText(entry) { parts.append(platform) }
if let version = entry.version?.nonEmpty {
let short = self.compactVersion(version)
parts.append("v\(short)")
parts.append("app \(self.compactVersion(version))")
}
if parts.isEmpty { return nil }
parts.append("status \(self.roleText(entry))")
return parts.joined(separator: " · ")
}
static func platformText(_ entry: InstanceInfo) -> String? {
static func roleText(_ entry: NodeInfo) -> String {
if entry.isConnected { return "connected" }
if entry.isPaired { return "paired" }
return "unpaired"
}
static func detailLeft(_ entry: NodeInfo) -> String {
let role = self.roleText(entry)
if let ip = entry.remoteIp?.nonEmpty { return "\(ip) · \(role)" }
return role
}
static func headlineRight(_ entry: NodeInfo) -> String? {
self.platformText(entry)
}
static func detailRightVersion(_ entry: NodeInfo) -> String? {
guard let version = entry.version?.nonEmpty else { return nil }
return self.shortVersionLabel(version)
}
static func platformText(_ entry: NodeInfo) -> String? {
if let raw = entry.platform?.nonEmpty {
return self.prettyPlatform(raw) ?? raw
}
@ -99,8 +101,17 @@ struct NodeMenuEntryFormatter {
return trimmed
}
static func leadingSymbol(_ entry: InstanceInfo) -> String {
if self.isGateway(entry) { return self.safeSystemSymbol("dot.radiowaves.left.and.right", fallback: "network") }
private static func shortVersionLabel(_ raw: String) -> String {
let compact = self.compactVersion(raw)
if compact.isEmpty { return compact }
if compact.lowercased().hasPrefix("v") { return compact }
if let first = compact.unicodeScalars.first, CharacterSet.decimalDigits.contains(first) {
return "v\(compact)"
}
return compact
}
static func leadingSymbol(_ entry: NodeInfo) -> String {
if let family = entry.deviceFamily?.lowercased() {
if family.contains("mac") {
return self.safeSystemSymbol("laptopcomputer", fallback: "laptopcomputer")
@ -116,9 +127,11 @@ struct NodeMenuEntryFormatter {
return "cpu"
}
static func isAndroid(_ entry: InstanceInfo) -> Bool {
static func isAndroid(_ entry: NodeInfo) -> Bool {
let family = entry.deviceFamily?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
return family == "android"
if family == "android" { return true }
let platform = entry.platform?.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
return platform?.contains("android") == true
}
private static func safeSystemSymbol(_ preferred: String, fallback: String) -> String {
@ -128,7 +141,7 @@ struct NodeMenuEntryFormatter {
}
struct NodeMenuRowView: View {
let entry: InstanceInfo
let entry: NodeInfo
let width: CGFloat
@Environment(\.menuItemHighlighted) private var isHighlighted
@ -146,11 +159,32 @@ struct NodeMenuRowView: View {
.frame(width: 22, height: 22, alignment: .center)
VStack(alignment: .leading, spacing: 2) {
Text(NodeMenuEntryFormatter.primaryName(self.entry))
.font(.callout.weight(NodeMenuEntryFormatter.isGateway(self.entry) ? .semibold : .regular))
.foregroundStyle(self.primaryColor)
.lineLimit(1)
.truncationMode(.middle)
HStack(alignment: .firstTextBaseline, spacing: 8) {
Text(NodeMenuEntryFormatter.primaryName(self.entry))
.font(.callout.weight(NodeMenuEntryFormatter.isConnected(self.entry) ? .semibold : .regular))
.foregroundStyle(self.primaryColor)
.lineLimit(1)
.truncationMode(.middle)
.layoutPriority(1)
Spacer(minLength: 8)
HStack(alignment: .firstTextBaseline, spacing: 6) {
if let right = NodeMenuEntryFormatter.headlineRight(self.entry) {
Text(right)
.font(.caption.monospacedDigit())
.foregroundStyle(self.secondaryColor)
.lineLimit(1)
.truncationMode(.middle)
.layoutPriority(2)
}
Image(systemName: "chevron.right")
.font(.caption.weight(.semibold))
.foregroundStyle(self.secondaryColor)
.padding(.leading, 2)
}
}
HStack(alignment: .firstTextBaseline, spacing: 8) {
Text(NodeMenuEntryFormatter.detailLeft(self.entry))
@ -161,21 +195,15 @@ struct NodeMenuRowView: View {
Spacer(minLength: 0)
HStack(alignment: .firstTextBaseline, spacing: 6) {
if let right = NodeMenuEntryFormatter.detailRight(self.entry) {
Text(right)
.font(.caption.monospacedDigit())
.foregroundStyle(self.secondaryColor)
.lineLimit(1)
.truncationMode(.middle)
}
Image(systemName: "chevron.right")
.font(.caption.weight(.semibold))
if let version = NodeMenuEntryFormatter.detailRightVersion(self.entry) {
Text(version)
.font(.caption.monospacedDigit())
.foregroundStyle(self.secondaryColor)
.padding(.leading, 2)
.lineLimit(1)
.truncationMode(.middle)
}
}
.frame(maxWidth: .infinity, alignment: .leading)
}
.frame(maxWidth: .infinity, alignment: .leading)
@ -215,3 +243,36 @@ struct AndroidMark: View {
}
}
}
struct NodeMenuMultilineView: View {
let label: String
let value: String
let width: CGFloat
@Environment(\.menuItemHighlighted) private var isHighlighted
private var primaryColor: Color {
self.isHighlighted ? Color(nsColor: .selectedMenuItemTextColor) : .primary
}
private var secondaryColor: Color {
self.isHighlighted ? Color(nsColor: .selectedMenuItemTextColor).opacity(0.85) : .secondary
}
var body: some View {
VStack(alignment: .leading, spacing: 4) {
Text("\(self.label):")
.font(.caption.weight(.semibold))
.foregroundStyle(self.secondaryColor)
Text(self.value)
.font(.caption)
.foregroundStyle(self.primaryColor)
.multilineTextAlignment(.leading)
.fixedSize(horizontal: false, vertical: true)
}
.padding(.vertical, 6)
.padding(.leading, 18)
.padding(.trailing, 12)
.frame(width: max(1, self.width), alignment: .leading)
}
}

View File

@ -0,0 +1,84 @@
import Foundation
import Observation
import OSLog
struct NodeInfo: Identifiable, Codable {
let nodeId: String
let displayName: String?
let platform: String?
let version: String?
let deviceFamily: String?
let modelIdentifier: String?
let remoteIp: String?
let caps: [String]?
let commands: [String]?
let permissions: [String: Bool]?
let paired: Bool?
let connected: Bool?
var id: String { self.nodeId }
var isConnected: Bool { self.connected ?? false }
var isPaired: Bool { self.paired ?? false }
}
private struct NodeListResponse: Codable {
let ts: Double?
let nodes: [NodeInfo]
}
@MainActor
@Observable
final class NodesStore {
static let shared = NodesStore()
var nodes: [NodeInfo] = []
var lastError: String?
var statusMessage: String?
var isLoading = false
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "nodes")
private var task: Task<Void, Never>?
private let interval: TimeInterval = 30
private var startCount = 0
func start() {
self.startCount += 1
guard self.startCount == 1 else { return }
guard self.task == nil else { return }
self.task = Task.detached { [weak self] in
guard let self else { return }
await self.refresh()
while !Task.isCancelled {
try? await Task.sleep(nanoseconds: UInt64(self.interval * 1_000_000_000))
await self.refresh()
}
}
}
func stop() {
guard self.startCount > 0 else { return }
self.startCount -= 1
guard self.startCount == 0 else { return }
self.task?.cancel()
self.task = nil
}
func refresh() async {
if self.isLoading { return }
self.statusMessage = nil
self.isLoading = true
defer { self.isLoading = false }
do {
let data = try await GatewayConnection.shared.requestRaw(method: "node.list", params: nil, timeoutMs: 8000)
let decoded = try JSONDecoder().decode(NodeListResponse.self, from: data)
self.nodes = decoded.nodes
self.lastError = nil
self.statusMessage = nil
} catch {
self.logger.error("node.list failed \(error.localizedDescription, privacy: .public)")
self.nodes = []
self.lastError = error.localizedDescription
self.statusMessage = nil
}
}
}

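`NodesStore` reference-counts `start()`/`stop()` so the 30-second `node.list` polling task runs only while at least one caller is active. A sketch of the intended usage, assuming a SwiftUI surface ties the lifecycle to visibility (the view name is hypothetical):

```swift
import SwiftUI

// Hypothetical menu section backed by the shared store. Each appearance
// increments startCount; the poll task is created on 1 and cancelled when
// the count returns to 0, so nested/overlapping surfaces stay balanced.
struct NodesMenuSection: View {
    private let store = NodesStore.shared

    var body: some View {
        ForEach(store.nodes) { node in
            NodeMenuRowView(entry: node, width: 320)
        }
        .onAppear { store.start() }
        .onDisappear { store.stop() }
    }
}
```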
View File

@ -5,6 +5,8 @@ import UserNotifications
@MainActor
struct NotificationManager {
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "notifications")
private static let hasTimeSensitiveEntitlement: Bool = {
guard let task = SecTaskCreateFromSelf(nil) else { return false }
let key = "com.apple.developer.usernotifications.time-sensitive" as CFString
@ -17,8 +19,12 @@ struct NotificationManager {
let status = await center.notificationSettings()
if status.authorizationStatus == .notDetermined {
let granted = try? await center.requestAuthorization(options: [.alert, .sound, .badge])
if granted != true { return false }
if granted != true {
self.logger.warning("notification permission denied (request)")
return false
}
} else if status.authorizationStatus != .authorized {
self.logger.warning("notification permission denied status=\(status.authorizationStatus.rawValue)")
return false
}
@ -37,15 +43,22 @@ struct NotificationManager {
case .active:
content.interruptionLevel = .active
case .timeSensitive:
content.interruptionLevel = Self.hasTimeSensitiveEntitlement ? .timeSensitive : .active
if Self.hasTimeSensitiveEntitlement {
content.interruptionLevel = .timeSensitive
} else {
self.logger.debug("time-sensitive notification requested without entitlement; falling back to active")
content.interruptionLevel = .active
}
}
}
let req = UNNotificationRequest(identifier: UUID().uuidString, content: content, trigger: nil)
do {
try await center.add(req)
self.logger.debug("notification queued")
return true
} catch {
self.logger.error("notification send failed: \(error.localizedDescription)")
return false
}
}

View File

@ -5,7 +5,6 @@ import ClawdisIPC
import CoreGraphics
import Foundation
import Observation
import OSLog
import Speech
import UserNotifications

View File

@ -21,7 +21,6 @@ struct SettingsRootView: View {
if self.isNixMode {
self.nixManagedBanner
}
TabView(selection: self.$selectedTab) {
GeneralSettings(state: self.state)
.tabItem { Label("General", systemImage: "gearshape") }
@ -63,7 +62,7 @@ struct SettingsRootView: View {
.tag(SettingsTab.permissions)
if self.state.debugPaneEnabled {
DebugSettings()
DebugSettings(state: self.state)
.tabItem { Label("Debug", systemImage: "ant") }
.tag(SettingsTab.debug)
}

View File

@ -0,0 +1,158 @@
import AVFoundation
import Foundation
import OSLog
@MainActor
final class TalkAudioPlayer: NSObject, @preconcurrency AVAudioPlayerDelegate {
static let shared = TalkAudioPlayer()
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "talk.tts")
private var player: AVAudioPlayer?
private var playback: Playback?
private final class Playback: @unchecked Sendable {
private let lock = NSLock()
private var finished = false
private var continuation: CheckedContinuation<TalkPlaybackResult, Never>?
private var watchdog: Task<Void, Never>?
func setContinuation(_ continuation: CheckedContinuation<TalkPlaybackResult, Never>) {
self.lock.lock()
defer { self.lock.unlock() }
self.continuation = continuation
}
func setWatchdog(_ task: Task<Void, Never>?) {
self.lock.lock()
let old = self.watchdog
self.watchdog = task
self.lock.unlock()
old?.cancel()
}
func cancelWatchdog() {
self.setWatchdog(nil)
}
func finish(_ result: TalkPlaybackResult) {
let continuation: CheckedContinuation<TalkPlaybackResult, Never>?
self.lock.lock()
if self.finished {
continuation = nil
} else {
self.finished = true
continuation = self.continuation
self.continuation = nil
}
self.lock.unlock()
continuation?.resume(returning: result)
}
}
func play(data: Data) async -> TalkPlaybackResult {
self.stopInternal()
let playback = Playback()
self.playback = playback
return await withCheckedContinuation { continuation in
playback.setContinuation(continuation)
do {
let player = try AVAudioPlayer(data: data)
self.player = player
player.delegate = self
player.prepareToPlay()
self.armWatchdog(playback: playback)
let ok = player.play()
if !ok {
self.logger.error("talk audio player refused to play")
self.finish(playback: playback, result: TalkPlaybackResult(finished: false, interruptedAt: nil))
}
} catch {
self.logger.error("talk audio player failed: \(error.localizedDescription, privacy: .public)")
self.finish(playback: playback, result: TalkPlaybackResult(finished: false, interruptedAt: nil))
}
}
}
func stop() -> Double? {
guard let player else { return nil }
let time = player.currentTime
self.stopInternal(interruptedAt: time)
return time
}
func audioPlayerDidFinishPlaying(_: AVAudioPlayer, successfully flag: Bool) {
self.stopInternal(finished: flag)
}
private func stopInternal(finished: Bool = false, interruptedAt: Double? = nil) {
guard let playback else { return }
let result = TalkPlaybackResult(finished: finished, interruptedAt: interruptedAt)
self.finish(playback: playback, result: result)
}
private func finish(playback: Playback, result: TalkPlaybackResult) {
playback.cancelWatchdog()
playback.finish(result)
guard self.playback === playback else { return }
self.playback = nil
self.player?.stop()
self.player = nil
}
private func stopInternal() {
if let playback = self.playback {
let interruptedAt = self.player?.currentTime
self.finish(
playback: playback,
result: TalkPlaybackResult(finished: false, interruptedAt: interruptedAt))
return
}
self.player?.stop()
self.player = nil
}
private func armWatchdog(playback: Playback) {
playback.setWatchdog(Task { @MainActor [weak self] in
guard let self else { return }
do {
try await Task.sleep(nanoseconds: 650_000_000)
} catch {
return
}
if Task.isCancelled { return }
guard self.playback === playback else { return }
if self.player?.isPlaying != true {
self.logger.error("talk audio player did not start playing")
self.finish(playback: playback, result: TalkPlaybackResult(finished: false, interruptedAt: nil))
return
}
let duration = self.player?.duration ?? 0
let timeoutSeconds = min(max(2.0, duration + 2.0), 5 * 60.0)
do {
try await Task.sleep(nanoseconds: UInt64(timeoutSeconds * 1_000_000_000))
} catch {
return
}
if Task.isCancelled { return }
guard self.playback === playback else { return }
guard self.player?.isPlaying == true else { return }
self.logger.error("talk audio player watchdog fired")
self.finish(playback: playback, result: TalkPlaybackResult(finished: false, interruptedAt: nil))
})
}
}
struct TalkPlaybackResult: Sendable {
let finished: Bool
let interruptedAt: Double?
}

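`TalkAudioPlayer.play(data:)` wraps `AVAudioPlayer` in a checked continuation guarded by a lock and a watchdog, so the call suspends until playback finishes, is stopped, or times out — and resumes exactly once. A sketch of a call site under those assumptions (the `speak` helper is hypothetical):

```swift
import Foundation

// Hypothetical caller: await the single TalkPlaybackResult and branch on
// whether playback ran to completion or was cut off mid-utterance.
@MainActor
func speak(_ audio: Data) async {
    let result = await TalkAudioPlayer.shared.play(data: audio)
    if !result.finished, let seconds = result.interruptedAt {
        // e.g. feed interruptedAt back into the next talk prompt
        print("playback interrupted at \(seconds)s")
    }
}
```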
View File

@ -0,0 +1,61 @@
import Observation
import OSLog
@MainActor
@Observable
final class TalkModeController {
static let shared = TalkModeController()
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "talk.controller")
private(set) var phase: TalkModePhase = .idle
private(set) var isPaused: Bool = false
func setEnabled(_ enabled: Bool) async {
self.logger.info("talk enabled=\(enabled)")
if enabled {
TalkOverlayController.shared.present()
} else {
TalkOverlayController.shared.dismiss()
}
await TalkModeRuntime.shared.setEnabled(enabled)
}
func updatePhase(_ phase: TalkModePhase) {
self.phase = phase
TalkOverlayController.shared.updatePhase(phase)
let effectivePhase = self.isPaused ? "paused" : phase.rawValue
Task { await GatewayConnection.shared.talkMode(enabled: AppStateStore.shared.talkEnabled, phase: effectivePhase) }
}
func updateLevel(_ level: Double) {
TalkOverlayController.shared.updateLevel(level)
}
func setPaused(_ paused: Bool) {
guard self.isPaused != paused else { return }
self.logger.info("talk paused=\(paused)")
self.isPaused = paused
TalkOverlayController.shared.updatePaused(paused)
let effectivePhase = paused ? "paused" : self.phase.rawValue
Task { await GatewayConnection.shared.talkMode(enabled: AppStateStore.shared.talkEnabled, phase: effectivePhase) }
Task { await TalkModeRuntime.shared.setPaused(paused) }
}
func togglePaused() {
self.setPaused(!self.isPaused)
}
func stopSpeaking(reason: TalkStopReason = .userTap) {
Task { await TalkModeRuntime.shared.stopSpeaking(reason: reason) }
}
func exitTalkMode() {
Task { await AppStateStore.shared.setTalkEnabled(false) }
}
}
enum TalkStopReason {
case userTap
case speech
case manual
}

View File

@ -0,0 +1,890 @@
import AVFoundation
import ClawdisChatUI
import ClawdisKit
import Foundation
import OSLog
import Speech
actor TalkModeRuntime {
static let shared = TalkModeRuntime()
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "talk.runtime")
private let ttsLogger = Logger(subsystem: "com.steipete.clawdis", category: "talk.tts")
private static let defaultModelIdFallback = "eleven_v3"
private final class RMSMeter: @unchecked Sendable {
private let lock = NSLock()
private var latestRMS: Double = 0
func set(_ rms: Double) {
self.lock.lock()
self.latestRMS = rms
self.lock.unlock()
}
func get() -> Double {
self.lock.lock()
let value = self.latestRMS
self.lock.unlock()
return value
}
}
private var recognizer: SFSpeechRecognizer?
private var audioEngine: AVAudioEngine?
private var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
private var recognitionTask: SFSpeechRecognitionTask?
private var recognitionGeneration: Int = 0
private var rmsTask: Task<Void, Never>?
private let rmsMeter = RMSMeter()
private var captureTask: Task<Void, Never>?
private var silenceTask: Task<Void, Never>?
private var phase: TalkModePhase = .idle
private var isEnabled = false
private var isPaused = false
private var lifecycleGeneration: Int = 0
private var lastHeard: Date?
private var noiseFloorRMS: Double = 1e-4
private var lastTranscript: String = ""
private var lastSpeechEnergyAt: Date?
private var defaultVoiceId: String?
private var currentVoiceId: String?
private var defaultModelId: String?
private var currentModelId: String?
private var voiceOverrideActive = false
private var modelOverrideActive = false
private var defaultOutputFormat: String?
private var interruptOnSpeech: Bool = true
private var lastInterruptedAtSeconds: Double?
private var voiceAliases: [String: String] = [:]
private var lastSpokenText: String?
private var apiKey: String?
private var fallbackVoiceId: String?
private var lastPlaybackWasPCM: Bool = false
private let silenceWindow: TimeInterval = 0.7
private let minSpeechRMS: Double = 1e-3
private let speechBoostFactor: Double = 6.0
// MARK: - Lifecycle
func setEnabled(_ enabled: Bool) async {
guard enabled != self.isEnabled else { return }
self.isEnabled = enabled
self.lifecycleGeneration &+= 1
if enabled {
await self.start()
} else {
await self.stop()
}
}
func setPaused(_ paused: Bool) async {
guard paused != self.isPaused else { return }
self.isPaused = paused
await MainActor.run { TalkModeController.shared.updateLevel(0) }
guard self.isEnabled else { return }
if paused {
self.lastTranscript = ""
self.lastHeard = nil
self.lastSpeechEnergyAt = nil
await self.stopRecognition()
return
}
if self.phase == .idle || self.phase == .listening {
await self.startRecognition()
self.phase = .listening
await MainActor.run { TalkModeController.shared.updatePhase(.listening) }
self.startSilenceMonitor()
}
}
private func isCurrent(_ generation: Int) -> Bool {
generation == self.lifecycleGeneration && self.isEnabled
}
private func start() async {
let gen = self.lifecycleGeneration
guard voiceWakeSupported else { return }
guard PermissionManager.voiceWakePermissionsGranted() else {
self.logger.debug("talk runtime not starting: permissions missing")
return
}
await self.reloadConfig()
guard self.isCurrent(gen) else { return }
if self.isPaused {
self.phase = .idle
await MainActor.run {
TalkModeController.shared.updateLevel(0)
TalkModeController.shared.updatePhase(.idle)
}
return
}
await self.startRecognition()
guard self.isCurrent(gen) else { return }
self.phase = .listening
await MainActor.run { TalkModeController.shared.updatePhase(.listening) }
self.startSilenceMonitor()
}
private func stop() async {
self.captureTask?.cancel()
self.captureTask = nil
self.silenceTask?.cancel()
self.silenceTask = nil
// Stop audio before changing phase (stopSpeaking is gated on .speaking).
await self.stopSpeaking(reason: .manual)
self.lastTranscript = ""
self.lastHeard = nil
self.lastSpeechEnergyAt = nil
self.phase = .idle
await self.stopRecognition()
await MainActor.run {
TalkModeController.shared.updateLevel(0)
TalkModeController.shared.updatePhase(.idle)
}
}
// MARK: - Speech recognition
private struct RecognitionUpdate {
let transcript: String?
let hasConfidence: Bool
let isFinal: Bool
let errorDescription: String?
let generation: Int
}
private func startRecognition() async {
await self.stopRecognition()
self.recognitionGeneration &+= 1
let generation = self.recognitionGeneration
let locale = await MainActor.run { AppStateStore.shared.voiceWakeLocaleID }
self.recognizer = SFSpeechRecognizer(locale: Locale(identifier: locale))
guard let recognizer, recognizer.isAvailable else {
self.logger.error("talk recognizer unavailable")
return
}
self.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
self.recognitionRequest?.shouldReportPartialResults = true
guard let request = self.recognitionRequest else { return }
if self.audioEngine == nil {
self.audioEngine = AVAudioEngine()
}
guard let audioEngine = self.audioEngine else { return }
let input = audioEngine.inputNode
let format = input.outputFormat(forBus: 0)
input.removeTap(onBus: 0)
let meter = self.rmsMeter
input.installTap(onBus: 0, bufferSize: 2048, format: format) { [weak request, meter] buffer, _ in
request?.append(buffer)
if let rms = Self.rmsLevel(buffer: buffer) {
meter.set(rms)
}
}
audioEngine.prepare()
do {
try audioEngine.start()
} catch {
self.logger.error("talk audio engine start failed: \(error.localizedDescription, privacy: .public)")
return
}
self.startRMSTicker(meter: meter)
self.recognitionTask = recognizer.recognitionTask(with: request) { [weak self, generation] result, error in
guard let self else { return }
let segments = result?.bestTranscription.segments ?? []
let transcript = result?.bestTranscription.formattedString
let update = RecognitionUpdate(
transcript: transcript,
hasConfidence: segments.contains { $0.confidence > 0.6 },
isFinal: result?.isFinal ?? false,
errorDescription: error?.localizedDescription,
generation: generation)
Task { await self.handleRecognition(update) }
}
}
private func stopRecognition() async {
self.recognitionGeneration &+= 1
self.recognitionTask?.cancel()
self.recognitionTask = nil
self.recognitionRequest?.endAudio()
self.recognitionRequest = nil
self.audioEngine?.inputNode.removeTap(onBus: 0)
self.audioEngine?.stop()
self.audioEngine = nil
self.recognizer = nil
self.rmsTask?.cancel()
self.rmsTask = nil
}
private func startRMSTicker(meter: RMSMeter) {
self.rmsTask?.cancel()
self.rmsTask = Task { [weak self, meter] in
while let self {
try? await Task.sleep(nanoseconds: 50_000_000)
if Task.isCancelled { return }
await self.noteAudioLevel(rms: meter.get())
}
}
}
private func handleRecognition(_ update: RecognitionUpdate) async {
guard update.generation == self.recognitionGeneration else { return }
guard !self.isPaused else { return }
if let errorDescription = update.errorDescription {
self.logger.debug("talk recognition error: \(errorDescription, privacy: .public)")
}
guard let transcript = update.transcript else { return }
let trimmed = transcript.trimmingCharacters(in: .whitespacesAndNewlines)
if self.phase == .speaking, self.interruptOnSpeech {
if await self.shouldInterrupt(transcript: trimmed, hasConfidence: update.hasConfidence) {
await self.stopSpeaking(reason: .speech)
self.lastTranscript = ""
self.lastHeard = nil
await self.startListening()
}
return
}
guard self.phase == .listening else { return }
if !trimmed.isEmpty {
self.lastTranscript = trimmed
self.lastHeard = Date()
}
if update.isFinal {
self.lastTranscript = trimmed
}
}
// MARK: - Silence handling
private func startSilenceMonitor() {
self.silenceTask?.cancel()
self.silenceTask = Task { [weak self] in
await self?.silenceLoop()
}
}
private func silenceLoop() async {
while self.isEnabled {
try? await Task.sleep(nanoseconds: 200_000_000)
await self.checkSilence()
}
}
private func checkSilence() async {
guard !self.isPaused else { return }
guard self.phase == .listening else { return }
let transcript = self.lastTranscript.trimmingCharacters(in: .whitespacesAndNewlines)
guard !transcript.isEmpty else { return }
guard let lastHeard else { return }
let elapsed = Date().timeIntervalSince(lastHeard)
guard elapsed >= self.silenceWindow else { return }
await self.finalizeTranscript(transcript)
}
private func startListening() async {
self.phase = .listening
self.lastTranscript = ""
self.lastHeard = nil
await MainActor.run {
TalkModeController.shared.updatePhase(.listening)
TalkModeController.shared.updateLevel(0)
}
}
private func finalizeTranscript(_ text: String) async {
self.lastTranscript = ""
self.lastHeard = nil
self.phase = .thinking
await MainActor.run { TalkModeController.shared.updatePhase(.thinking) }
await self.stopRecognition()
await self.sendAndSpeak(text)
}
// MARK: - Gateway + TTS
private func sendAndSpeak(_ transcript: String) async {
let gen = self.lifecycleGeneration
await self.reloadConfig()
guard self.isCurrent(gen) else { return }
let prompt = self.buildPrompt(transcript: transcript)
let activeSessionKey = await MainActor.run { WebChatManager.shared.activeSessionKey }
let sessionKey: String = if let activeSessionKey {
activeSessionKey
} else {
await GatewayConnection.shared.mainSessionKey()
}
let runId = UUID().uuidString
let startedAt = Date().timeIntervalSince1970
self.logger.info(
"talk send start runId=\(runId, privacy: .public) session=\(sessionKey, privacy: .public) chars=\(prompt.count, privacy: .public)")
do {
let response = try await GatewayConnection.shared.chatSend(
sessionKey: sessionKey,
message: prompt,
thinking: "low",
idempotencyKey: runId,
attachments: [])
guard self.isCurrent(gen) else { return }
self.logger.info(
"talk chat.send ok runId=\(response.runId, privacy: .public) session=\(sessionKey, privacy: .public)")
guard let assistantText = await self.waitForAssistantText(
sessionKey: sessionKey,
since: startedAt,
timeoutSeconds: 45)
else {
self.logger.warning("talk assistant text missing after timeout")
await self.startListening()
await self.startRecognition()
return
}
guard self.isCurrent(gen) else { return }
self.logger.info("talk assistant text len=\(assistantText.count, privacy: .public)")
await self.playAssistant(text: assistantText)
guard self.isCurrent(gen) else { return }
await self.resumeListeningIfNeeded()
return
} catch {
self.logger.error("talk chat.send failed: \(error.localizedDescription, privacy: .public)")
await self.resumeListeningIfNeeded()
return
}
}
private func resumeListeningIfNeeded() async {
if self.isPaused {
self.lastTranscript = ""
self.lastHeard = nil
self.lastSpeechEnergyAt = nil
await MainActor.run {
TalkModeController.shared.updateLevel(0)
}
return
}
await self.startListening()
await self.startRecognition()
}
private func buildPrompt(transcript: String) -> String {
let interrupted = self.lastInterruptedAtSeconds
self.lastInterruptedAtSeconds = nil
return TalkPromptBuilder.build(transcript: transcript, interruptedAtSeconds: interrupted)
}
private func waitForAssistantText(
sessionKey: String,
since: Double,
timeoutSeconds: Int) async -> String?
{
let deadline = Date().addingTimeInterval(TimeInterval(timeoutSeconds))
while Date() < deadline {
if let text = await self.latestAssistantText(sessionKey: sessionKey, since: since) {
return text
}
try? await Task.sleep(nanoseconds: 300_000_000)
}
return nil
}
private func latestAssistantText(sessionKey: String, since: Double? = nil) async -> String? {
do {
let history = try await GatewayConnection.shared.chatHistory(sessionKey: sessionKey)
let messages = history.messages ?? []
let decoded: [ClawdisChatMessage] = messages.compactMap { item in
guard let data = try? JSONEncoder().encode(item) else { return nil }
return try? JSONDecoder().decode(ClawdisChatMessage.self, from: data)
}
let assistant = decoded.last { message in
guard message.role == "assistant" else { return false }
guard let since else { return true }
guard let timestamp = message.timestamp else { return false }
return TalkHistoryTimestamp.isAfter(timestamp, sinceSeconds: since)
}
guard let assistant else { return nil }
let text = assistant.content.compactMap(\.text).joined(separator: "\n")
let trimmed = text.trimmingCharacters(in: CharacterSet.whitespacesAndNewlines)
return trimmed.isEmpty ? nil : trimmed
} catch {
self.logger.error("talk history fetch failed: \(error.localizedDescription, privacy: .public)")
return nil
}
}
private func playAssistant(text: String) async {
let gen = self.lifecycleGeneration
let parse = TalkDirectiveParser.parse(text)
let directive = parse.directive
let cleaned = parse.stripped.trimmingCharacters(in: .whitespacesAndNewlines)
guard !cleaned.isEmpty else { return }
guard self.isCurrent(gen) else { return }
if !parse.unknownKeys.isEmpty {
self.logger
.warning("talk directive ignored keys: \(parse.unknownKeys.joined(separator: ","), privacy: .public)")
}
let requestedVoice = directive?.voiceId?.trimmingCharacters(in: .whitespacesAndNewlines)
let resolvedVoice = self.resolveVoiceAlias(requestedVoice)
if let requestedVoice, !requestedVoice.isEmpty, resolvedVoice == nil {
self.logger.warning("talk unknown voice alias \(requestedVoice, privacy: .public)")
}
if let voice = resolvedVoice {
if directive?.once == true {
self.logger.info("talk voice override (once) voiceId=\(voice, privacy: .public)")
} else {
self.currentVoiceId = voice
self.voiceOverrideActive = true
self.logger.info("talk voice override voiceId=\(voice, privacy: .public)")
}
}
if let model = directive?.modelId {
if directive?.once == true {
self.logger.info("talk model override (once) modelId=\(model, privacy: .public)")
} else {
self.currentModelId = model
self.modelOverrideActive = true
}
}
let apiKey = self.apiKey?.trimmingCharacters(in: .whitespacesAndNewlines)
let preferredVoice =
resolvedVoice ??
self.currentVoiceId ??
self.defaultVoiceId
let language = ElevenLabsTTSClient.validatedLanguage(directive?.language)
let voiceId: String? = if let apiKey, !apiKey.isEmpty {
await self.resolveVoiceId(preferred: preferredVoice, apiKey: apiKey)
} else {
nil
}
if (apiKey ?? "").isEmpty {
self.ttsLogger.warning("talk missing ELEVENLABS_API_KEY; falling back to system voice")
} else if voiceId == nil {
self.ttsLogger.warning("talk missing voiceId; falling back to system voice")
} else if let voiceId {
self.ttsLogger
.info("talk TTS request voiceId=\(voiceId, privacy: .public) chars=\(cleaned.count, privacy: .public)")
}
self.lastSpokenText = cleaned
// Scale the synthesis timeout with text length (0.12 s per character), clamped to 20-90 s.
let synthTimeoutSeconds = max(20.0, min(90.0, Double(cleaned.count) * 0.12))
do {
if let apiKey, !apiKey.isEmpty, let voiceId {
let desiredOutputFormat = directive?.outputFormat ?? self.defaultOutputFormat ?? "pcm_44100"
let outputFormat = ElevenLabsTTSClient.validatedOutputFormat(desiredOutputFormat)
if outputFormat == nil, !desiredOutputFormat.isEmpty {
self.logger
.warning(
"talk output_format unsupported for local playback: \(desiredOutputFormat, privacy: .public)")
}
let modelId = directive?.modelId ?? self.currentModelId ?? self.defaultModelId
func makeRequest(outputFormat: String?) -> ElevenLabsTTSRequest {
ElevenLabsTTSRequest(
text: cleaned,
modelId: modelId,
outputFormat: outputFormat,
speed: TalkTTSValidation.resolveSpeed(speed: directive?.speed, rateWPM: directive?.rateWPM),
stability: TalkTTSValidation.validatedStability(directive?.stability, modelId: modelId),
similarity: TalkTTSValidation.validatedUnit(directive?.similarity),
style: TalkTTSValidation.validatedUnit(directive?.style),
speakerBoost: directive?.speakerBoost,
seed: TalkTTSValidation.validatedSeed(directive?.seed),
normalize: ElevenLabsTTSClient.validatedNormalize(directive?.normalize),
language: language,
latencyTier: TalkTTSValidation.validatedLatencyTier(directive?.latencyTier))
}
let request = makeRequest(outputFormat: outputFormat)
self.ttsLogger.info("talk TTS synth timeout=\(synthTimeoutSeconds, privacy: .public)s")
let client = ElevenLabsTTSClient(apiKey: apiKey)
let stream = client.streamSynthesize(voiceId: voiceId, request: request)
guard self.isCurrent(gen) else { return }
if self.interruptOnSpeech {
await self.startRecognition()
guard self.isCurrent(gen) else { return }
}
await MainActor.run { TalkModeController.shared.updatePhase(.speaking) }
self.phase = .speaking
let sampleRate = TalkTTSValidation.pcmSampleRate(from: outputFormat)
var result: StreamingPlaybackResult
if let sampleRate {
self.lastPlaybackWasPCM = true
result = await self.playPCM(stream: stream, sampleRate: sampleRate)
if !result.finished, result.interruptedAt == nil {
let mp3Format = ElevenLabsTTSClient.validatedOutputFormat("mp3_44100")
self.ttsLogger.warning("talk pcm playback failed; retrying mp3")
self.lastPlaybackWasPCM = false
let mp3Stream = client.streamSynthesize(
voiceId: voiceId,
request: makeRequest(outputFormat: mp3Format))
result = await self.playMP3(stream: mp3Stream)
}
} else {
self.lastPlaybackWasPCM = false
result = await self.playMP3(stream: stream)
}
self.ttsLogger
.info(
"talk audio result finished=\(result.finished, privacy: .public) interruptedAt=\(String(describing: result.interruptedAt), privacy: .public)")
if !result.finished, result.interruptedAt == nil {
throw NSError(domain: "StreamingAudioPlayer", code: 1, userInfo: [
NSLocalizedDescriptionKey: "audio playback failed",
])
}
if !result.finished, let interruptedAt = result.interruptedAt,
self.phase == .speaking, self.interruptOnSpeech {
self.lastInterruptedAtSeconds = interruptedAt
}
} else {
self.ttsLogger.info("talk system voice start chars=\(cleaned.count, privacy: .public)")
if self.interruptOnSpeech {
await self.startRecognition()
guard self.isCurrent(gen) else { return }
}
await MainActor.run { TalkModeController.shared.updatePhase(.speaking) }
self.phase = .speaking
await TalkSystemSpeechSynthesizer.shared.stop()
try await TalkSystemSpeechSynthesizer.shared.speak(text: cleaned, language: language)
self.ttsLogger.info("talk system voice done")
}
} catch {
self.ttsLogger
.error("talk TTS failed: \(error.localizedDescription, privacy: .public); falling back to system voice")
do {
if self.interruptOnSpeech {
await self.startRecognition()
guard self.isCurrent(gen) else { return }
}
await MainActor.run { TalkModeController.shared.updatePhase(.speaking) }
self.phase = .speaking
await TalkSystemSpeechSynthesizer.shared.stop()
try await TalkSystemSpeechSynthesizer.shared.speak(text: cleaned, language: language)
} catch {
self.ttsLogger.error("talk system voice failed: \(error.localizedDescription, privacy: .public)")
}
}
if self.phase == .speaking {
self.phase = .thinking
await MainActor.run { TalkModeController.shared.updatePhase(.thinking) }
}
}
private func resolveVoiceId(preferred: String?, apiKey: String) async -> String? {
let trimmed = preferred?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
if !trimmed.isEmpty {
if let resolved = self.resolveVoiceAlias(trimmed) { return resolved }
self.ttsLogger.warning("talk unknown voice alias \(trimmed, privacy: .public)")
}
if let fallbackVoiceId { return fallbackVoiceId }
do {
let voices = try await ElevenLabsTTSClient(apiKey: apiKey).listVoices()
guard let first = voices.first else {
self.ttsLogger.error("elevenlabs voices list empty")
return nil
}
self.fallbackVoiceId = first.voiceId
if self.defaultVoiceId == nil {
self.defaultVoiceId = first.voiceId
}
if !self.voiceOverrideActive {
self.currentVoiceId = first.voiceId
}
let name = first.name ?? "unknown"
self.ttsLogger
.info("talk default voice selected \(name, privacy: .public) (\(first.voiceId, privacy: .public))")
return first.voiceId
} catch {
self.ttsLogger.error("elevenlabs list voices failed: \(error.localizedDescription, privacy: .public)")
return nil
}
}
private func resolveVoiceAlias(_ value: String?) -> String? {
let trimmed = (value ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let normalized = trimmed.lowercased()
if let mapped = self.voiceAliases[normalized] { return mapped }
if self.voiceAliases.values.contains(where: { $0.caseInsensitiveCompare(trimmed) == .orderedSame }) {
return trimmed
}
return Self.isLikelyVoiceId(trimmed) ? trimmed : nil
}
private static func isLikelyVoiceId(_ value: String) -> Bool {
guard value.count >= 10 else { return false }
return value.allSatisfy { $0.isLetter || $0.isNumber || $0 == "-" || $0 == "_" }
}
func stopSpeaking(reason: TalkStopReason) async {
// Stop the player used for the last playback first so we capture its interrupt
// position, then stop the other player and the system synthesizer defensively.
let usePCM = self.lastPlaybackWasPCM
let interruptedAt = usePCM ? await self.stopPCM() : await self.stopMP3()
_ = usePCM ? await self.stopMP3() : await self.stopPCM()
await TalkSystemSpeechSynthesizer.shared.stop()
guard self.phase == .speaking else { return }
if reason == .speech, let interruptedAt {
self.lastInterruptedAtSeconds = interruptedAt
}
if reason == .manual {
return
}
if reason == .speech || reason == .userTap {
await self.startListening()
return
}
self.phase = .thinking
await MainActor.run { TalkModeController.shared.updatePhase(.thinking) }
}
// MARK: - Audio playback (MainActor helpers)
@MainActor
private func playPCM(
stream: AsyncThrowingStream<Data, Error>,
sampleRate: Double) async -> StreamingPlaybackResult
{
await PCMStreamingAudioPlayer.shared.play(stream: stream, sampleRate: sampleRate)
}
@MainActor
private func playMP3(stream: AsyncThrowingStream<Data, Error>) async -> StreamingPlaybackResult {
await StreamingAudioPlayer.shared.play(stream: stream)
}
@MainActor
private func stopPCM() -> Double? {
PCMStreamingAudioPlayer.shared.stop()
}
@MainActor
private func stopMP3() -> Double? {
StreamingAudioPlayer.shared.stop()
}
// MARK: - Config
private func reloadConfig() async {
let cfg = await self.fetchTalkConfig()
self.defaultVoiceId = cfg.voiceId
self.voiceAliases = cfg.voiceAliases
if !self.voiceOverrideActive {
self.currentVoiceId = cfg.voiceId
}
self.defaultModelId = cfg.modelId
if !self.modelOverrideActive {
self.currentModelId = cfg.modelId
}
self.defaultOutputFormat = cfg.outputFormat
self.interruptOnSpeech = cfg.interruptOnSpeech
self.apiKey = cfg.apiKey
let hasApiKey = (cfg.apiKey?.isEmpty == false)
let voiceLabel = (cfg.voiceId?.isEmpty == false) ? cfg.voiceId! : "none"
let modelLabel = (cfg.modelId?.isEmpty == false) ? cfg.modelId! : "none"
self.logger
.info(
"talk config voiceId=\(voiceLabel, privacy: .public) modelId=\(modelLabel, privacy: .public) apiKey=\(hasApiKey, privacy: .public) interrupt=\(cfg.interruptOnSpeech, privacy: .public)")
}
private struct TalkRuntimeConfig {
let voiceId: String?
let voiceAliases: [String: String]
let modelId: String?
let outputFormat: String?
let interruptOnSpeech: Bool
let apiKey: String?
}
private func fetchTalkConfig() async -> TalkRuntimeConfig {
let env = ProcessInfo.processInfo.environment
let envVoice = env["ELEVENLABS_VOICE_ID"]?.trimmingCharacters(in: .whitespacesAndNewlines)
let sagVoice = env["SAG_VOICE_ID"]?.trimmingCharacters(in: .whitespacesAndNewlines)
let envApiKey = env["ELEVENLABS_API_KEY"]?.trimmingCharacters(in: .whitespacesAndNewlines)
do {
let snap: ConfigSnapshot = try await GatewayConnection.shared.requestDecoded(
method: .configGet,
params: nil,
timeoutMs: 8000)
let talk = snap.config?["talk"]?.dictionaryValue
let ui = snap.config?["ui"]?.dictionaryValue
let rawSeam = ui?["seamColor"]?.stringValue?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
await MainActor.run {
AppStateStore.shared.seamColorHex = rawSeam.isEmpty ? nil : rawSeam
}
let voice = talk?["voiceId"]?.stringValue
let rawAliases = talk?["voiceAliases"]?.dictionaryValue
let resolvedAliases: [String: String] =
rawAliases?.reduce(into: [:]) { acc, entry in
let key = entry.key.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
let value = entry.value.stringValue?.trimmingCharacters(in: .whitespacesAndNewlines) ?? ""
guard !key.isEmpty, !value.isEmpty else { return }
acc[key] = value
} ?? [:]
let model = talk?["modelId"]?.stringValue?.trimmingCharacters(in: .whitespacesAndNewlines)
let resolvedModel = (model?.isEmpty == false) ? model! : Self.defaultModelIdFallback
let outputFormat = talk?["outputFormat"]?.stringValue
let interrupt = talk?["interruptOnSpeech"]?.boolValue
let apiKey = talk?["apiKey"]?.stringValue
let resolvedVoice =
(voice?.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty == false ? voice : nil) ??
(envVoice?.isEmpty == false ? envVoice : nil) ??
(sagVoice?.isEmpty == false ? sagVoice : nil)
let resolvedApiKey =
(envApiKey?.isEmpty == false ? envApiKey : nil) ??
(apiKey?.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty == false ? apiKey : nil)
return TalkRuntimeConfig(
voiceId: resolvedVoice,
voiceAliases: resolvedAliases,
modelId: resolvedModel,
outputFormat: outputFormat,
interruptOnSpeech: interrupt ?? true,
apiKey: resolvedApiKey)
} catch {
let resolvedVoice =
(envVoice?.isEmpty == false ? envVoice : nil) ??
(sagVoice?.isEmpty == false ? sagVoice : nil)
let resolvedApiKey = envApiKey?.isEmpty == false ? envApiKey : nil
return TalkRuntimeConfig(
voiceId: resolvedVoice,
voiceAliases: [:],
modelId: Self.defaultModelIdFallback,
outputFormat: nil,
interruptOnSpeech: true,
apiKey: resolvedApiKey)
}
}
// MARK: - Audio level handling
private func noteAudioLevel(rms: Double) async {
if self.phase != .listening, self.phase != .speaking { return }
// Exponential moving average of the noise floor: adapt quickly downward, slowly upward.
let alpha: Double = rms < self.noiseFloorRMS ? 0.08 : 0.01
self.noiseFloorRMS = max(1e-7, self.noiseFloorRMS + (rms - self.noiseFloorRMS) * alpha)
let threshold = max(self.minSpeechRMS, self.noiseFloorRMS * self.speechBoostFactor)
if rms >= threshold {
let now = Date()
self.lastHeard = now
self.lastSpeechEnergyAt = now
}
if self.phase == .listening {
let clamped = min(1.0, max(0.0, rms / max(self.minSpeechRMS, threshold)))
await MainActor.run { TalkModeController.shared.updateLevel(clamped) }
}
}
private static func rmsLevel(buffer: AVAudioPCMBuffer) -> Double? {
// First channel only; talk capture is expected to be mono.
guard let channelData = buffer.floatChannelData?.pointee else { return nil }
let frameCount = Int(buffer.frameLength)
guard frameCount > 0 else { return nil }
var sum: Double = 0
for i in 0..<frameCount {
let sample = Double(channelData[i])
sum += sample * sample
}
return sqrt(sum / Double(frameCount))
}
private func shouldInterrupt(transcript: String, hasConfidence: Bool) async -> Bool {
let trimmed = transcript.trimmingCharacters(in: .whitespacesAndNewlines)
guard trimmed.count >= 3 else { return false }
if self.isLikelyEcho(of: trimmed) { return false }
let now = Date()
// Require recent speech energy (within 350 ms) to avoid phantom interrupts.
if let lastSpeechEnergyAt, now.timeIntervalSince(lastSpeechEnergyAt) > 0.35 {
return false
}
return hasConfidence
}
private func isLikelyEcho(of transcript: String) -> Bool {
guard let spoken = self.lastSpokenText?.lowercased(), !spoken.isEmpty else { return false }
let probe = transcript.lowercased()
// Treat the transcript as an echo when the text we just spoke contains it.
return spoken.contains(probe)
}
private static func resolveSpeed(speed: Double?, rateWPM: Int?, logger: Logger) -> Double? {
if let rateWPM, rateWPM > 0 {
// 175 WPM is treated as the normal speaking rate (speed 1.0).
let resolved = Double(rateWPM) / 175.0
if resolved <= 0.5 || resolved >= 2.0 {
logger.warning("talk rateWPM out of range: \(rateWPM, privacy: .public)")
return nil
}
return resolved
}
if let speed {
if speed <= 0.5 || speed >= 2.0 {
logger.warning("talk speed out of range: \(speed, privacy: .public)")
return nil
}
return speed
}
return nil
}
private static func validatedUnit(_ value: Double?, name: String, logger: Logger) -> Double? {
guard let value else { return nil }
if value < 0 || value > 1 {
logger.warning("talk \(name, privacy: .public) out of range: \(value, privacy: .public)")
return nil
}
return value
}
private static func validatedSeed(_ value: Int?, logger: Logger) -> UInt32? {
guard let value else { return nil }
if value < 0 || value > 4_294_967_295 {
logger.warning("talk seed out of range: \(value, privacy: .public)")
return nil
}
return UInt32(value)
}
private static func validatedNormalize(_ value: String?, logger: Logger) -> String? {
guard let value else { return nil }
let normalized = value.trimmingCharacters(in: .whitespacesAndNewlines).lowercased()
guard ["auto", "on", "off"].contains(normalized) else {
logger.warning("talk normalize invalid: \(normalized, privacy: .public)")
return nil
}
return normalized
}
}

View File

@ -0,0 +1,8 @@
import Foundation
enum TalkModePhase: String {
case idle
case listening
case thinking
case speaking
}

View File

@ -0,0 +1,146 @@
import AppKit
import Observation
import OSLog
import SwiftUI
@MainActor
@Observable
final class TalkOverlayController {
static let shared = TalkOverlayController()
static let overlaySize: CGFloat = 440
static let orbSize: CGFloat = 96
static let orbPadding: CGFloat = 12
static let orbHitSlop: CGFloat = 10
private let logger = Logger(subsystem: "com.steipete.clawdis", category: "talk.overlay")
struct Model {
var isVisible: Bool = false
var phase: TalkModePhase = .idle
var isPaused: Bool = false
var level: Double = 0
}
var model = Model()
private var window: NSPanel?
private var hostingView: NSHostingView<TalkOverlayView>?
private let screenInset: CGFloat = 0
func present() {
self.ensureWindow()
self.hostingView?.rootView = TalkOverlayView(controller: self)
let target = self.targetFrame()
guard let window else { return }
if !self.model.isVisible {
self.model.isVisible = true
let start = target.offsetBy(dx: 0, dy: -6)
window.setFrame(start, display: true)
window.alphaValue = 0
window.orderFrontRegardless()
NSAnimationContext.runAnimationGroup { context in
context.duration = 0.18
context.timingFunction = CAMediaTimingFunction(name: .easeOut)
window.animator().setFrame(target, display: true)
window.animator().alphaValue = 1
}
} else {
window.setFrame(target, display: true)
window.orderFrontRegardless()
}
}
func dismiss() {
guard let window else {
self.model.isVisible = false
return
}
let target = window.frame.offsetBy(dx: 6, dy: 6)
NSAnimationContext.runAnimationGroup { context in
context.duration = 0.16
context.timingFunction = CAMediaTimingFunction(name: .easeOut)
window.animator().setFrame(target, display: true)
window.animator().alphaValue = 0
} completionHandler: {
Task { @MainActor in
window.orderOut(nil)
self.model.isVisible = false
}
}
}
func updatePhase(_ phase: TalkModePhase) {
guard self.model.phase != phase else { return }
self.logger.info("talk overlay phase=\(phase.rawValue, privacy: .public)")
self.model.phase = phase
}
func updatePaused(_ paused: Bool) {
guard self.model.isPaused != paused else { return }
self.logger.info("talk overlay paused=\(paused, privacy: .public)")
self.model.isPaused = paused
}
func updateLevel(_ level: Double) {
guard self.model.isVisible else { return }
self.model.level = max(0, min(1, level))
}
func currentWindowOrigin() -> CGPoint? {
self.window?.frame.origin
}
func setWindowOrigin(_ origin: CGPoint) {
guard let window else { return }
window.setFrameOrigin(origin)
}
// MARK: - Private
private func ensureWindow() {
if self.window != nil { return }
let panel = NSPanel(
contentRect: NSRect(x: 0, y: 0, width: Self.overlaySize, height: Self.overlaySize),
styleMask: [.nonactivatingPanel, .borderless],
backing: .buffered,
defer: false)
panel.isOpaque = false
panel.backgroundColor = .clear
panel.hasShadow = false
panel.level = NSWindow.Level(rawValue: NSWindow.Level.popUpMenu.rawValue - 4)
panel.collectionBehavior = [.canJoinAllSpaces, .fullScreenAuxiliary, .transient]
panel.hidesOnDeactivate = false
panel.isMovable = false
panel.acceptsMouseMovedEvents = true
panel.isFloatingPanel = true
panel.becomesKeyOnlyIfNeeded = true
panel.titleVisibility = .hidden
panel.titlebarAppearsTransparent = true
let host = TalkOverlayHostingView(rootView: TalkOverlayView(controller: self))
host.translatesAutoresizingMaskIntoConstraints = false
panel.contentView = host
self.hostingView = host
self.window = panel
}
private func targetFrame() -> NSRect {
let screen = self.window?.screen
?? NSScreen.main
?? NSScreen.screens.first
guard let screen else { return .zero }
let size = NSSize(width: Self.overlaySize, height: Self.overlaySize)
let visible = screen.visibleFrame
let origin = CGPoint(
x: visible.maxX - size.width - self.screenInset,
y: visible.maxY - size.height - self.screenInset)
return NSRect(origin: origin, size: size)
}
}
private final class TalkOverlayHostingView: NSHostingView<TalkOverlayView> {
override func acceptsFirstMouse(for event: NSEvent?) -> Bool {
true
}
}

View File

@ -0,0 +1,219 @@
import AppKit
import SwiftUI
struct TalkOverlayView: View {
var controller: TalkOverlayController
@State private var appState = AppStateStore.shared
@State private var hoveringWindow = false
var body: some View {
ZStack(alignment: .topTrailing) {
let isPaused = self.controller.model.isPaused
Color.clear
TalkOrbView(
phase: self.controller.model.phase,
level: self.controller.model.level,
accent: self.seamColor,
isPaused: isPaused)
.frame(width: TalkOverlayController.orbSize, height: TalkOverlayController.orbSize)
.padding(.top, TalkOverlayController.orbPadding)
.padding(.trailing, TalkOverlayController.orbPadding)
.contentShape(Circle())
.opacity(isPaused ? 0.55 : 1)
.background(
TalkOrbInteractionView(
onSingleClick: { TalkModeController.shared.togglePaused() },
onDoubleClick: { TalkModeController.shared.stopSpeaking(reason: .userTap) },
onDragStart: { TalkModeController.shared.setPaused(true) }))
.overlay(alignment: .topLeading) {
Button {
TalkModeController.shared.exitTalkMode()
} label: {
Image(systemName: "xmark")
.font(.system(size: 10, weight: .bold))
.foregroundStyle(Color.white.opacity(0.95))
.frame(width: 18, height: 18)
.background(Color.black.opacity(0.4))
.clipShape(Circle())
}
.buttonStyle(.plain)
.contentShape(Circle())
.offset(x: -2, y: -2)
.opacity(self.hoveringWindow ? 1 : 0)
.animation(.easeOut(duration: 0.12), value: self.hoveringWindow)
}
.onHover { self.hoveringWindow = $0 }
}
.frame(
width: TalkOverlayController.overlaySize,
height: TalkOverlayController.overlaySize,
alignment: .topTrailing)
}
private static let defaultSeamColor = Color(red: 79 / 255.0, green: 122 / 255.0, blue: 154 / 255.0)
private var seamColor: Color {
Self.color(fromHex: self.appState.seamColorHex) ?? Self.defaultSeamColor
}
private static func color(fromHex raw: String?) -> Color? {
let trimmed = (raw ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let hex = trimmed.hasPrefix("#") ? String(trimmed.dropFirst()) : trimmed
guard hex.count == 6, let value = Int(hex, radix: 16) else { return nil }
let r = Double((value >> 16) & 0xFF) / 255.0
let g = Double((value >> 8) & 0xFF) / 255.0
let b = Double(value & 0xFF) / 255.0
return Color(red: r, green: g, blue: b)
}
}
private struct TalkOrbInteractionView: NSViewRepresentable {
let onSingleClick: () -> Void
let onDoubleClick: () -> Void
let onDragStart: () -> Void
func makeNSView(context: Context) -> NSView {
let view = OrbInteractionNSView()
view.onSingleClick = self.onSingleClick
view.onDoubleClick = self.onDoubleClick
view.onDragStart = self.onDragStart
view.wantsLayer = true
view.layer?.backgroundColor = NSColor.clear.cgColor
return view
}
func updateNSView(_ nsView: NSView, context: Context) {
guard let view = nsView as? OrbInteractionNSView else { return }
view.onSingleClick = self.onSingleClick
view.onDoubleClick = self.onDoubleClick
view.onDragStart = self.onDragStart
}
}
private final class OrbInteractionNSView: NSView {
var onSingleClick: (() -> Void)?
var onDoubleClick: (() -> Void)?
var onDragStart: (() -> Void)?
private var mouseDownEvent: NSEvent?
private var didDrag = false
private var suppressSingleClick = false
override var acceptsFirstResponder: Bool { true }
override func acceptsFirstMouse(for event: NSEvent?) -> Bool { true }
override func mouseDown(with event: NSEvent) {
self.mouseDownEvent = event
self.didDrag = false
self.suppressSingleClick = event.clickCount > 1
if event.clickCount == 2 {
self.onDoubleClick?()
}
}
override func mouseDragged(with event: NSEvent) {
guard let startEvent = self.mouseDownEvent else { return }
if !self.didDrag {
let dx = event.locationInWindow.x - startEvent.locationInWindow.x
let dy = event.locationInWindow.y - startEvent.locationInWindow.y
if abs(dx) + abs(dy) < 2 { return }
self.didDrag = true
self.onDragStart?()
self.window?.performDrag(with: startEvent)
}
}
override func mouseUp(with event: NSEvent) {
if !self.didDrag && !self.suppressSingleClick {
self.onSingleClick?()
}
self.mouseDownEvent = nil
self.didDrag = false
self.suppressSingleClick = false
}
}
private struct TalkOrbView: View {
let phase: TalkModePhase
let level: Double
let accent: Color
let isPaused: Bool
var body: some View {
if self.isPaused {
Circle()
.fill(self.orbGradient)
.overlay(Circle().stroke(Color.white.opacity(0.35), lineWidth: 1))
.shadow(color: Color.black.opacity(0.18), radius: 10, x: 0, y: 5)
} else {
TimelineView(.animation) { context in
let t = context.date.timeIntervalSinceReferenceDate
let listenScale = phase == .listening ? (1 + CGFloat(self.level) * 0.12) : 1
let pulse = phase == .speaking ? (1 + 0.06 * sin(t * 6)) : 1
ZStack {
Circle()
.fill(self.orbGradient)
.overlay(Circle().stroke(Color.white.opacity(0.45), lineWidth: 1))
.shadow(color: Color.black.opacity(0.22), radius: 10, x: 0, y: 5)
.scaleEffect(pulse * listenScale)
TalkWaveRings(phase: phase, level: level, time: t, accent: self.accent)
if phase == .thinking {
TalkOrbitArcs(time: t)
}
}
}
}
}
private var orbGradient: RadialGradient {
RadialGradient(
colors: [Color.white, self.accent],
center: .topLeading,
startRadius: 4,
endRadius: 52)
}
}
private struct TalkWaveRings: View {
let phase: TalkModePhase
let level: Double
let time: TimeInterval
let accent: Color
var body: some View {
ZStack {
ForEach(0..<3, id: \.self) { idx in
let speed = phase == .speaking ? 1.4 : phase == .listening ? 0.9 : 0.6
let progress = (time * speed + Double(idx) * 0.28).truncatingRemainder(dividingBy: 1)
let amplitude = phase == .speaking ? 0.95 : phase == .listening ? 0.5 + level * 0.7 : 0.35
let scale = 0.75 + progress * amplitude + (phase == .listening ? level * 0.15 : 0)
let alpha = phase == .speaking ? 0.72 : phase == .listening ? 0.58 + level * 0.28 : 0.4
Circle()
.stroke(self.accent.opacity(alpha - progress * 0.3), lineWidth: 1.6)
.scaleEffect(scale)
.opacity(alpha - progress * 0.6)
}
}
}
}
private struct TalkOrbitArcs: View {
let time: TimeInterval
var body: some View {
ZStack {
Circle()
.trim(from: 0.08, to: 0.26)
.stroke(Color.white.opacity(0.88), style: StrokeStyle(lineWidth: 1.6, lineCap: .round))
.rotationEffect(.degrees(time * 42))
Circle()
.trim(from: 0.62, to: 0.86)
.stroke(Color.white.opacity(0.7), style: StrokeStyle(lineWidth: 1.4, lineCap: .round))
.rotationEffect(.degrees(-time * 35))
}
.scaleEffect(1.08)
}
}

View File

@ -1,7 +1,6 @@
import AppKit
import Foundation
import Observation
import OSLog
@MainActor
@Observable

View File

@ -1,6 +1,5 @@
import AppKit
import Observation
import OSLog
import SwiftUI
/// Lightweight, borderless panel that shows the current voice wake transcript near the menu bar.

View File

@ -1,6 +1,5 @@
import AVFoundation
import Foundation
import OSLog
import Speech
import SwabbleKit

View File

@ -29,6 +29,10 @@ final class WebChatManager {
var onPanelVisibilityChanged: ((Bool) -> Void)?
var activeSessionKey: String? {
self.panelSessionKey ?? self.windowSessionKey
}
func show(sessionKey: String) {
self.closePanel()
if let controller = self.windowController {

View File

@ -155,7 +155,8 @@ final class WebChatSwiftUIWindowController {
self.sessionKey = sessionKey
self.presentation = presentation
let vm = ClawdisChatViewModel(sessionKey: sessionKey, transport: transport)
self.hosting = NSHostingController(rootView: ClawdisChatView(viewModel: vm))
let accent = Self.color(fromHex: AppStateStore.shared.seamColorHex)
self.hosting = NSHostingController(rootView: ClawdisChatView(viewModel: vm, userAccent: accent))
self.contentController = Self.makeContentController(for: presentation, hosting: self.hosting)
self.window = Self.makeWindow(for: presentation, contentViewController: self.contentController)
}
@ -355,4 +356,15 @@ final class WebChatSwiftUIWindowController {
window.setFrame(frame, display: false)
}
}
private static func color(fromHex raw: String?) -> Color? {
let trimmed = (raw ?? "").trimmingCharacters(in: .whitespacesAndNewlines)
guard !trimmed.isEmpty else { return nil }
let hex = trimmed.hasPrefix("#") ? String(trimmed.dropFirst()) : trimmed
guard hex.count == 6, let value = Int(hex, radix: 16) else { return nil }
let r = Double((value >> 16) & 0xFF) / 255.0
let g = Double((value >> 8) & 0xFF) / 255.0
let b = Double(value & 0xFF) / 255.0
return Color(red: r, green: g, blue: b)
}
}

View File

@ -689,6 +689,23 @@ public struct ConfigSetParams: Codable {
}
}
public struct TalkModeParams: Codable {
public let enabled: Bool
public let phase: String?
public init(
enabled: Bool,
phase: String?
) {
self.enabled = enabled
self.phase = phase
}
private enum CodingKeys: String, CodingKey {
case enabled
case phase
}
}
public struct ProvidersStatusParams: Codable {
public let probe: Bool?
public let timeoutms: Int?

View File

@ -52,12 +52,17 @@ import Testing
try FileManager.default.setAttributes([.posixPermissions: 0o755], ofItemAtPath: nodePath.path)
try self.makeExec(at: scriptPath)
let cmd = CommandResolver.clawdisCommand(subcommand: "rpc", defaults: defaults)
let cmd = CommandResolver.clawdisCommand(
subcommand: "rpc",
defaults: defaults,
searchPaths: [tmp.appendingPathComponent("node_modules/.bin").path])
#expect(cmd.count >= 3)
#expect(cmd[0] == nodePath.path)
#expect(cmd[1] == scriptPath.path)
#expect(cmd[2] == "rpc")
if cmd.count >= 3 {
#expect(cmd[0] == nodePath.path)
#expect(cmd[1] == scriptPath.path)
#expect(cmd[2] == "rpc")
}
}
@Test func fallsBackToPnpm() async throws {

View File

@ -43,7 +43,8 @@ struct ConnectionsSettingsSmokeTests {
elapsedMs: 120,
bot: ProvidersStatusSnapshot.TelegramBot(id: 123, username: "clawdisbot"),
webhook: ProvidersStatusSnapshot.TelegramWebhook(url: "https://example.com/hook", hasCustomCert: false)),
lastProbeAt: 1_700_000_050_000))
lastProbeAt: 1_700_000_050_000),
discord: nil)
store.whatsappLoginMessage = "Scan QR"
store.whatsappLoginQrDataUrl =
@ -92,7 +93,8 @@ struct ConnectionsSettingsSmokeTests {
elapsedMs: 120,
bot: nil,
webhook: nil),
lastProbeAt: 1_700_000_100_000))
lastProbeAt: 1_700_000_100_000),
discord: nil)
let view = ConnectionsSettings(store: store)
_ = view.body

View File

@ -0,0 +1,97 @@
import Foundation
import Testing
@testable import Clawdis
@Suite(.serialized) struct TalkAudioPlayerTests {
@MainActor
@Test func playDoesNotHangWhenPlaybackEndsOrFails() async throws {
let wav = makeWav16Mono(sampleRate: 8000, samples: 80)
defer { _ = TalkAudioPlayer.shared.stop() }
_ = try await withTimeout(seconds: 2.0) {
await TalkAudioPlayer.shared.play(data: wav)
}
#expect(true)
}
@MainActor
@Test func playDoesNotHangWhenPlayIsCalledTwice() async throws {
let wav = makeWav16Mono(sampleRate: 8000, samples: 800)
defer { _ = TalkAudioPlayer.shared.stop() }
let first = Task { @MainActor in
await TalkAudioPlayer.shared.play(data: wav)
}
await Task.yield()
_ = await TalkAudioPlayer.shared.play(data: wav)
_ = try await withTimeout(seconds: 2.0) {
await first.value
}
#expect(true)
}
}
private struct TimeoutError: Error {}
private func withTimeout<T: Sendable>(
seconds: Double,
_ work: @escaping @Sendable () async throws -> T) async throws -> T
{
try await withThrowingTaskGroup(of: T.self) { group in
group.addTask {
try await work()
}
group.addTask {
try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
throw TimeoutError()
}
let result = try await group.next()
group.cancelAll()
guard let result else { throw TimeoutError() }
return result
}
}
private func makeWav16Mono(sampleRate: UInt32, samples: Int) -> Data {
let channels: UInt16 = 1
let bitsPerSample: UInt16 = 16
let blockAlign = channels * (bitsPerSample / 8)
let byteRate = sampleRate * UInt32(blockAlign)
let dataSize = UInt32(samples) * UInt32(blockAlign)
var data = Data()
data.append(contentsOf: [0x52, 0x49, 0x46, 0x46]) // RIFF
data.appendLEUInt32(36 + dataSize)
data.append(contentsOf: [0x57, 0x41, 0x56, 0x45]) // WAVE
data.append(contentsOf: [0x66, 0x6D, 0x74, 0x20]) // fmt
data.appendLEUInt32(16) // PCM
data.appendLEUInt16(1) // audioFormat
data.appendLEUInt16(channels)
data.appendLEUInt32(sampleRate)
data.appendLEUInt32(byteRate)
data.appendLEUInt16(blockAlign)
data.appendLEUInt16(bitsPerSample)
data.append(contentsOf: [0x64, 0x61, 0x74, 0x61]) // data
data.appendLEUInt32(dataSize)
// Silence samples.
data.append(Data(repeating: 0, count: Int(dataSize)))
return data
}
private extension Data {
mutating func appendLEUInt16(_ value: UInt16) {
var v = value.littleEndian
Swift.withUnsafeBytes(of: &v) { append(contentsOf: $0) }
}
mutating func appendLEUInt32(_ value: UInt32) {
var v = value.littleEndian
Swift.withUnsafeBytes(of: &v) { append(contentsOf: $0) }
}
}

View File

@ -12,10 +12,15 @@ let package = Package(
.library(name: "ClawdisKit", targets: ["ClawdisKit"]),
.library(name: "ClawdisChatUI", targets: ["ClawdisChatUI"]),
],
dependencies: [
.package(path: "../../../../ElevenLabsKit"),
],
targets: [
.target(
name: "ClawdisKit",
dependencies: [],
dependencies: [
.product(name: "ElevenLabsKit", package: "ElevenLabsKit"),
],
resources: [
.process("Resources"),
],

View File

@ -137,9 +137,10 @@ private struct ChatBubbleShape: InsettableShape {
struct ChatMessageBubble: View {
let message: ClawdisChatMessage
let style: ClawdisChatView.Style
let userAccent: Color?
var body: some View {
ChatMessageBody(message: self.message, isUser: self.isUser, style: self.style)
ChatMessageBody(message: self.message, isUser: self.isUser, style: self.style, userAccent: self.userAccent)
.frame(maxWidth: ChatUIConstants.bubbleMaxWidth, alignment: self.isUser ? .trailing : .leading)
.frame(maxWidth: .infinity, alignment: self.isUser ? .trailing : .leading)
.padding(.horizontal, 2)
@ -153,6 +154,7 @@ private struct ChatMessageBody: View {
let message: ClawdisChatMessage
let isUser: Bool
let style: ClawdisChatView.Style
let userAccent: Color?
var body: some View {
let text = self.primaryText
@ -287,7 +289,7 @@ private struct ChatMessageBody: View {
private var bubbleFillColor: Color {
if self.isUser {
return ClawdisChatTheme.userBubble
return self.userAccent ?? ClawdisChatTheme.userBubble
}
if self.style == .onboarding {
return ClawdisChatTheme.onboardingAssistantBubble

View File

@ -101,11 +101,7 @@ enum ClawdisChatTheme {
}
static var userBubble: Color {
#if os(macOS)
Color(nsColor: .systemBlue)
#else
Color(uiColor: .systemBlue)
#endif
Color(red: 127 / 255.0, green: 184 / 255.0, blue: 212 / 255.0)
}
static var assistantBubble: Color {

View File

@ -9,10 +9,12 @@ public struct ClawdisChatView: View {
@State private var viewModel: ClawdisChatViewModel
@State private var scrollerBottomID = UUID()
@State private var scrollPosition: UUID?
@State private var showSessions = false
@State private var hasPerformedInitialScroll = false
private let showsSessionSwitcher: Bool
private let style: Style
private let userAccent: Color?
private enum Layout {
#if os(macOS)
@ -37,11 +39,13 @@ public struct ClawdisChatView: View {
public init(
viewModel: ClawdisChatViewModel,
showsSessionSwitcher: Bool = false,
style: Style = .standard)
style: Style = .standard,
userAccent: Color? = nil)
{
self._viewModel = State(initialValue: viewModel)
self.showsSessionSwitcher = showsSessionSwitcher
self.style = style
self.userAccent = userAccent
}
public var body: some View {
@ -56,6 +60,7 @@ public struct ClawdisChatView: View {
.padding(.horizontal, Layout.outerPaddingHorizontal)
.padding(.vertical, Layout.outerPaddingVertical)
.frame(maxWidth: .infinity)
.frame(maxHeight: .infinity, alignment: .top)
}
.frame(maxWidth: .infinity, maxHeight: .infinity, alignment: .top)
.onAppear { self.viewModel.load() }
@ -69,68 +74,78 @@ public struct ClawdisChatView: View {
}
private var messageList: some View {
ScrollViewReader { proxy in
ZStack {
ScrollView {
LazyVStack(spacing: Layout.messageSpacing) {
ForEach(self.visibleMessages) { msg in
ChatMessageBubble(message: msg, style: self.style)
.frame(
maxWidth: .infinity,
alignment: msg.role.lowercased() == "user" ? .trailing : .leading)
}
if self.viewModel.pendingRunCount > 0 {
HStack {
ChatTypingIndicatorBubble(style: self.style)
.equatable()
Spacer(minLength: 0)
}
}
if !self.viewModel.pendingToolCalls.isEmpty {
ChatPendingToolsBubble(toolCalls: self.viewModel.pendingToolCalls)
.equatable()
.frame(maxWidth: .infinity, alignment: .leading)
}
if let text = self.viewModel.streamingAssistantText, !text.isEmpty {
ChatStreamingAssistantBubble(text: text)
.frame(maxWidth: .infinity, alignment: .leading)
}
Color.clear
.frame(height: Layout.messageListPaddingBottom + 1)
.id(self.scrollerBottomID)
}
.padding(.top, Layout.messageListPaddingTop)
.padding(.horizontal, Layout.messageListPaddingHorizontal)
}
if self.viewModel.isLoading {
ProgressView()
.controlSize(.large)
.frame(maxWidth: .infinity, maxHeight: .infinity)
ZStack {
ScrollView {
LazyVStack(spacing: Layout.messageSpacing) {
self.messageListRows
}
// Use scroll targets for stable auto-scroll without ScrollViewReader relayout glitches.
.scrollTargetLayout()
.padding(.top, Layout.messageListPaddingTop)
.padding(.horizontal, Layout.messageListPaddingHorizontal)
}
.onChange(of: self.viewModel.isLoading) { _, isLoading in
guard !isLoading, !self.hasPerformedInitialScroll else { return }
proxy.scrollTo(self.scrollerBottomID, anchor: .bottom)
self.hasPerformedInitialScroll = true
}
.onChange(of: self.viewModel.messages.count) { _, _ in
guard self.hasPerformedInitialScroll else { return }
withAnimation(.snappy(duration: 0.22)) {
proxy.scrollTo(self.scrollerBottomID, anchor: .bottom)
}
}
.onChange(of: self.viewModel.pendingRunCount) { _, _ in
guard self.hasPerformedInitialScroll else { return }
withAnimation(.snappy(duration: 0.22)) {
proxy.scrollTo(self.scrollerBottomID, anchor: .bottom)
}
// Keep the scroll pinned to the bottom for new messages.
.scrollPosition(id: self.$scrollPosition, anchor: .bottom)
if self.viewModel.isLoading {
ProgressView()
.controlSize(.large)
.frame(maxWidth: .infinity, maxHeight: .infinity)
}
}
// Ensure the message list claims vertical space on the first layout pass.
.frame(maxHeight: .infinity, alignment: .top)
.layoutPriority(1)
.onChange(of: self.viewModel.isLoading) { _, isLoading in
guard !isLoading, !self.hasPerformedInitialScroll else { return }
self.scrollPosition = self.scrollerBottomID
self.hasPerformedInitialScroll = true
}
.onChange(of: self.viewModel.messages.count) { _, _ in
guard self.hasPerformedInitialScroll else { return }
withAnimation(.snappy(duration: 0.22)) {
self.scrollPosition = self.scrollerBottomID
}
}
.onChange(of: self.viewModel.pendingRunCount) { _, _ in
guard self.hasPerformedInitialScroll else { return }
withAnimation(.snappy(duration: 0.22)) {
self.scrollPosition = self.scrollerBottomID
}
}
}
@ViewBuilder
private var messageListRows: some View {
ForEach(self.visibleMessages) { msg in
ChatMessageBubble(message: msg, style: self.style, userAccent: self.userAccent)
.frame(
maxWidth: .infinity,
alignment: msg.role.lowercased() == "user" ? .trailing : .leading)
}
if self.viewModel.pendingRunCount > 0 {
HStack {
ChatTypingIndicatorBubble(style: self.style)
.equatable()
Spacer(minLength: 0)
}
}
if !self.viewModel.pendingToolCalls.isEmpty {
ChatPendingToolsBubble(toolCalls: self.viewModel.pendingToolCalls)
.equatable()
.frame(maxWidth: .infinity, alignment: .leading)
}
if let text = self.viewModel.streamingAssistantText, !text.isEmpty {
ChatStreamingAssistantBubble(text: text)
.frame(maxWidth: .infinity, alignment: .leading)
}
Color.clear
.frame(height: Layout.messageListPaddingBottom + 1)
.id(self.scrollerBottomID)
}
private var visibleMessages: [ClawdisChatMessage] {
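The migration above drops `ScrollViewReader.scrollTo` in favor of the iOS 17/macOS 14 scroll-target APIs: `scrollTargetLayout()` marks the rows and `scrollPosition(id:anchor:)` drives scrolling through plain state. A minimal standalone sketch of that pinning pattern (view and property names are illustrative, not from the file):

```swift
import SwiftUI

// Hedged sketch of the pinning pattern used above (iOS 17+/macOS 14+):
// append to `lines` and the list snaps back to the bottom sentinel.
struct PinnedLogView: View {
    let lines: [String]
    @State private var scrollPosition: UUID?
    private let bottomID = UUID()

    var body: some View {
        ScrollView {
            LazyVStack(alignment: .leading, spacing: 4) {
                ForEach(lines, id: \.self) { Text($0) }
                Color.clear.frame(height: 1).id(bottomID)
            }
            // Marks the stack's children as scroll targets.
            .scrollTargetLayout()
        }
        // Writing the bound id scrolls the matching target into view.
        .scrollPosition(id: $scrollPosition, anchor: .bottom)
        .onChange(of: lines.count) { _, _ in
            scrollPosition = bottomID
        }
    }
}
```

Unlike `scrollTo`, this keeps scroll intent in state, which avoids the relayout glitches the comment above mentions when `ScrollViewReader` fires during a layout pass.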

View File

@ -150,9 +150,36 @@ public final class ClawdisChatViewModel {
}
private static func decodeMessages(_ raw: [AnyCodable]) -> [ClawdisChatMessage] {
raw.compactMap { item in
let decoded = raw.compactMap { item in
(try? ChatPayloadDecoding.decode(item, as: ClawdisChatMessage.self))
}
return Self.dedupeMessages(decoded)
}
private static func dedupeMessages(_ messages: [ClawdisChatMessage]) -> [ClawdisChatMessage] {
var result: [ClawdisChatMessage] = []
result.reserveCapacity(messages.count)
var seen = Set<String>()
for message in messages {
guard let key = Self.dedupeKey(for: message) else {
result.append(message)
continue
}
if seen.contains(key) { continue }
seen.insert(key)
result.append(message)
}
return result
}
private static func dedupeKey(for message: ClawdisChatMessage) -> String? {
guard let timestamp = message.timestamp else { return nil }
let text = message.content.compactMap(\.text).joined(separator: "\n")
.trimmingCharacters(in: .whitespacesAndNewlines)
guard !text.isEmpty else { return nil }
return "\(message.role)|\(timestamp)|\(text)"
}
private func performSend() async {
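The dedupe rule above can be sketched in isolation with a simplified message type: identical (role, timestamp, trimmed text) triples collapse to the first occurrence, and messages without a usable key always pass through. `Msg` here is a stand-in, not the real `ClawdisChatMessage`:

```swift
import Foundation

// Hedged sketch of dedupeMessages/dedupeKey above, order-preserving.
struct Msg {
    let role: String
    let timestamp: Double?
    let text: String
}

func dedupe(_ messages: [Msg]) -> [Msg] {
    var seen = Set<String>()
    var result: [Msg] = []
    result.reserveCapacity(messages.count)
    for m in messages {
        let trimmed = m.text.trimmingCharacters(in: .whitespacesAndNewlines)
        guard let ts = m.timestamp, !trimmed.isEmpty else {
            result.append(m) // no stable key: keep as-is
            continue
        }
        let key = "\(m.role)|\(ts)|\(trimmed)"
        if seen.insert(key).inserted { result.append(m) }
    }
    return result
}
```

Keying on timestamp plus content (rather than a message id) is what lets this collapse the duplicates that arise when the same history event is decoded twice.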
@ -293,8 +320,17 @@ public final class ClawdisChatViewModel {
return
}
if let runId = chat.runId, !self.pendingRuns.contains(runId) {
// Ignore events for other runs.
let isOurRun = chat.runId.flatMap { self.pendingRuns.contains($0) } ?? false
if !isOurRun {
// Keep multiple clients in sync: if another client finishes a run for our session, refresh history.
switch chat.state {
case "final", "aborted", "error":
self.streamingAssistantText = nil
self.pendingToolCallsById = [:]
Task { await self.refreshHistoryAfterRun() }
default:
break
}
return
}

View File

@ -0,0 +1,16 @@
import Foundation
@MainActor
public protocol StreamingAudioPlaying {
func play(stream: AsyncThrowingStream<Data, Error>) async -> StreamingPlaybackResult
func stop() -> Double?
}
@MainActor
public protocol PCMStreamingAudioPlaying {
func play(stream: AsyncThrowingStream<Data, Error>, sampleRate: Double) async -> StreamingPlaybackResult
func stop() -> Double?
}
extension StreamingAudioPlayer: StreamingAudioPlaying {}
extension PCMStreamingAudioPlayer: PCMStreamingAudioPlaying {}

View File

@ -0,0 +1,9 @@
@_exported import ElevenLabsKit
public typealias ElevenLabsVoice = ElevenLabsKit.ElevenLabsVoice
public typealias ElevenLabsTTSRequest = ElevenLabsKit.ElevenLabsTTSRequest
public typealias ElevenLabsTTSClient = ElevenLabsKit.ElevenLabsTTSClient
public typealias TalkTTSValidation = ElevenLabsKit.TalkTTSValidation
public typealias StreamingAudioPlayer = ElevenLabsKit.StreamingAudioPlayer
public typealias PCMStreamingAudioPlayer = ElevenLabsKit.PCMStreamingAudioPlayer
public typealias StreamingPlaybackResult = ElevenLabsKit.StreamingPlaybackResult

View File

@ -7,6 +7,7 @@ public enum JPEGTranscodeError: LocalizedError, Sendable {
case decodeFailed
case propertiesMissing
case encodeFailed
case sizeLimitExceeded(maxBytes: Int, actualBytes: Int)
public var errorDescription: String? {
switch self {
@ -16,6 +17,8 @@ public enum JPEGTranscodeError: LocalizedError, Sendable {
"Failed to read image properties"
case .encodeFailed:
"Failed to encode JPEG"
case let .sizeLimitExceeded(maxBytes, actualBytes):
"JPEG exceeds size limit (\(actualBytes) bytes > \(maxBytes) bytes)"
}
}
}
@ -32,7 +35,8 @@ public struct JPEGTranscoder: Sendable {
public static func transcodeToJPEG(
imageData: Data,
maxWidthPx: Int?,
quality: Double) throws -> (data: Data, widthPx: Int, heightPx: Int)
quality: Double,
maxBytes: Int? = nil) throws -> (data: Data, widthPx: Int, heightPx: Int)
{
guard let src = CGImageSourceCreateWithData(imageData as CFData, nil) else {
throw JPEGTranscodeError.decodeFailed
@ -58,7 +62,7 @@ public struct JPEGTranscoder: Sendable {
let orientedHeight = rotates90 ? pixelWidth : pixelHeight
let maxDim = max(orientedWidth, orientedHeight)
let targetMaxPixelSize: Int = {
var targetMaxPixelSize: Int = {
guard let maxWidthPx, maxWidthPx > 0 else { return maxDim }
guard orientedWidth > maxWidthPx else { return maxDim } // never upscale
@ -66,28 +70,66 @@ public struct JPEGTranscoder: Sendable {
return max(1, Int((Double(maxDim) * scale).rounded(.toNearestOrAwayFromZero)))
}()
let thumbOpts: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: targetMaxPixelSize,
kCGImageSourceShouldCacheImmediately: true,
]
func encode(maxPixelSize: Int, quality: Double) throws -> (data: Data, widthPx: Int, heightPx: Int) {
let thumbOpts: [CFString: Any] = [
kCGImageSourceCreateThumbnailFromImageAlways: true,
kCGImageSourceCreateThumbnailWithTransform: true,
kCGImageSourceThumbnailMaxPixelSize: maxPixelSize,
kCGImageSourceShouldCacheImmediately: true,
]
guard let img = CGImageSourceCreateThumbnailAtIndex(src, 0, thumbOpts as CFDictionary) else {
throw JPEGTranscodeError.decodeFailed
guard let img = CGImageSourceCreateThumbnailAtIndex(src, 0, thumbOpts as CFDictionary) else {
throw JPEGTranscodeError.decodeFailed
}
let out = NSMutableData()
guard let dest = CGImageDestinationCreateWithData(out, UTType.jpeg.identifier as CFString, 1, nil) else {
throw JPEGTranscodeError.encodeFailed
}
let q = self.clampQuality(quality)
let encodeProps = [kCGImageDestinationLossyCompressionQuality: q] as CFDictionary
CGImageDestinationAddImage(dest, img, encodeProps)
guard CGImageDestinationFinalize(dest) else {
throw JPEGTranscodeError.encodeFailed
}
return (out as Data, img.width, img.height)
}
let out = NSMutableData()
guard let dest = CGImageDestinationCreateWithData(out, UTType.jpeg.identifier as CFString, 1, nil) else {
throw JPEGTranscodeError.encodeFailed
}
let q = self.clampQuality(quality)
let encodeProps = [kCGImageDestinationLossyCompressionQuality: q] as CFDictionary
CGImageDestinationAddImage(dest, img, encodeProps)
guard CGImageDestinationFinalize(dest) else {
throw JPEGTranscodeError.encodeFailed
guard let maxBytes, maxBytes > 0 else {
return try encode(maxPixelSize: targetMaxPixelSize, quality: quality)
}
return (out as Data, img.width, img.height)
let minQuality = max(0.2, self.clampQuality(quality) * 0.35)
let minPixelSize = 256
var best = try encode(maxPixelSize: targetMaxPixelSize, quality: quality)
if best.data.count <= maxBytes {
return best
}
for _ in 0..<6 {
var q = self.clampQuality(quality)
for _ in 0..<6 {
let candidate = try encode(maxPixelSize: targetMaxPixelSize, quality: q)
best = candidate
if candidate.data.count <= maxBytes {
return candidate
}
if q <= minQuality { break }
q = max(minQuality, q * 0.75)
}
let nextPixelSize = max(Int(Double(targetMaxPixelSize) * 0.85), minPixelSize)
if nextPixelSize == targetMaxPixelSize {
break
}
targetMaxPixelSize = nextPixelSize
}
if best.data.count > maxBytes {
throw JPEGTranscodeError.sizeLimitExceeded(maxBytes: maxBytes, actualBytes: best.data.count)
}
return best
}
}
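The size-limit loop above alternates two back-offs: quality shrinks by ×0.75 per step (floored at `minQuality`), and when that is exhausted the pixel size shrinks by ×0.85 per round (floored at 256). The schedule itself can be sketched independently of ImageIO; `fits` stands in for an actual encode-and-measure step:

```swift
// Hedged sketch of the back-off schedule above, decoupled from ImageIO.
// Returns the first (pixelSize, quality) pair the predicate accepts.
func searchEncodeSettings(
    startQuality: Double,
    minQuality: Double,
    startPixelSize: Int,
    minPixelSize: Int,
    fits: (Int, Double) -> Bool) -> (pixelSize: Int, quality: Double)?
{
    var pixelSize = startPixelSize
    for _ in 0..<6 {                       // pixel-size rounds
        var q = startQuality
        for _ in 0..<6 {                   // quality steps within a round
            if fits(pixelSize, q) { return (pixelSize, q) }
            if q <= minQuality { break }
            q = max(minQuality, q * 0.75)
        }
        let next = max(Int(Double(pixelSize) * 0.85), minPixelSize)
        if next == pixelSize { break }     // floor reached; give up
        pixelSize = next
    }
    return nil
}
```

Quality is retried from the top after every pixel-size reduction, mirroring the real loop: a smaller image may fit at a quality that previously overshot the byte budget.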

View File

@ -4,6 +4,21 @@
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover" />
<title>Canvas</title>
<script>
(() => {
try {
const params = new URLSearchParams(window.location.search);
const platform = (params.get('platform') || '').trim().toLowerCase();
if (platform) {
document.documentElement.dataset.platform = platform;
return;
}
if (/android/i.test(navigator.userAgent || '')) {
document.documentElement.dataset.platform = 'android';
}
} catch (_) {}
})();
</script>
<style>
:root { color-scheme: dark; }
@media (prefers-reduced-motion: reduce) {
@ -18,6 +33,13 @@
#000;
overflow: hidden;
}
:root[data-platform="android"] body {
background:
radial-gradient(1200px 900px at 15% 20%, rgba(42, 113, 255, 0.62), rgba(0,0,0,0) 55%),
radial-gradient(900px 700px at 85% 30%, rgba(255, 0, 138, 0.52), rgba(0,0,0,0) 60%),
radial-gradient(1000px 900px at 60% 90%, rgba(0, 209, 255, 0.48), rgba(0,0,0,0) 60%),
#0b1328;
}
body::before {
content:"";
position: fixed;
@ -35,6 +57,7 @@
pointer-events: none;
animation: clawdis-grid-drift 140s ease-in-out infinite alternate;
}
:root[data-platform="android"] body::before { opacity: 0.80; }
body::after {
content:"";
position: fixed;
@ -52,6 +75,7 @@
pointer-events: none;
animation: clawdis-glow-drift 110s ease-in-out infinite alternate;
}
:root[data-platform="android"] body::after { opacity: 0.85; }
@supports (mix-blend-mode: screen) {
body::after { mix-blend-mode: screen; }
}
@ -77,6 +101,13 @@
touch-action: none;
z-index: 1;
}
:root[data-platform="android"] #clawdis-canvas {
background:
radial-gradient(1100px 800px at 20% 15%, rgba(42, 113, 255, 0.78), rgba(0,0,0,0) 58%),
radial-gradient(900px 650px at 82% 28%, rgba(255, 0, 138, 0.66), rgba(0,0,0,0) 62%),
radial-gradient(1000px 900px at 60% 88%, rgba(0, 209, 255, 0.58), rgba(0,0,0,0) 62%),
#141c33;
}
#clawdis-status {
position: fixed;
inset: 0;

Some files were not shown because too many files have changed in this diff.