Cursor metadata & zoom
Cursor position is recorded at 120 Hz, separate from video, and stored as normalized coordinates. At export, this data drives cursor rendering, click highlights, and zoom, whether you place keyframes by hand or let auto-detection do it.
120 Hz cursor capture
Captured on its own timer thread, separate from the video pipeline. Cursor accuracy isn't affected by frame drops or encoding load.
8 ms Timer Resolution
A DispatchSourceTimer fires every 8 milliseconds, which works out to a nominal 125 Hz, slightly above the 120 Hz headline figure. Each tick reads NSEvent.mouseLocation for the current cursor position.
Normalized Coordinates
Screen-space coordinates are converted to normalized values (0.0–1.0) relative to the capture area on every sample. ScreenCaptureKit's top-left Y origin is applied at this point.
Button State per Sample
NSEvent.pressedMouseButtons is read alongside the position on every tick. The button bitmask is reduced to a single boolean per sample (true if any button is down), which is enough for press-hold tracking.
Pause-aware Timestamps
Timestamps are computed using CMClockGetHostTimeClock relative to the recording start. When the recording is paused, the accumulated pause duration is subtracted so timestamps remain contiguous in the final file.
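The pause handling described above can be sketched in a few lines of Swift. This is an illustration only: PauseAwareClock and its method names are hypothetical, and host times are passed in as plain seconds so the logic can be read (and run) without Core Media. In the app, those values would presumably come from the host clock.

```swift
import Foundation

/// Hypothetical helper (not the app's actual type): turns host-clock seconds
/// into recording-relative seconds with paused time removed.
struct PauseAwareClock {
    let startTime: Double                      // host time when recording started
    private(set) var pausedTotal = 0.0         // seconds spent paused so far
    private(set) var pauseBegan: Double? = nil // host time of the active pause, if any

    init(startTime: Double) {
        self.startTime = startTime
    }

    mutating func pause(at hostTime: Double) {
        if pauseBegan == nil { pauseBegan = hostTime }
    }

    mutating func resume(at hostTime: Double) {
        guard let began = pauseBegan else { return }
        pausedTotal += hostTime - began
        pauseBegan = nil
    }

    /// Timestamp to store on a sample taken at `hostTime` while recording.
    func timestamp(at hostTime: Double) -> Double {
        return hostTime - startTime - pausedTotal
    }
}
```

With a recording started at host time 100 s and paused from 102 s to 105 s, a sample taken at 106 s is stored with t = 3 s, so the three paused seconds leave no gap in the file.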
Click Events (Separate Stream)
Mouse clicks are recorded as a separate event list alongside the position samples. Each click event stores its timestamp, normalized position, and the button identifier (left, right, middle).
Keystroke Events
Keyboard events are logged with timestamp, key code, modifier flags (Shift, Cmd, Option, Ctrl), and key-down/up state. Available for future overlays.
Data stored per sample
CursorSample {
t: Double // seconds since recording start
x: Double // normalized X (0.0 = left edge of capture area)
y: Double // normalized Y (0.0 = top edge of capture area)
p: Bool // any mouse button currently pressed
}
CursorClickEvent {
t: Double // timestamp
x, y: Double // normalized position
button: Int // 0 = left, 1 = right, 2 = middle
}
Coordinate transform (per sample)
// AppKit origin is bottom-left; SCKit is top-left
sckY = displayHeight - mouseY
// Translate into capture-area space
rawX = mouseX - captureOrigin.x
rawY = sckY - captureOrigin.y
// Normalize to [0, 1]
nx = rawX / captureWidth
ny = rawY / captureHeight
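The transform above, written as a small Swift function with a worked value. CaptureArea and normalize are hypothetical names chosen for this sketch, and a single display is assumed so that displayHeight is enough to flip the Y axis.

```swift
import Foundation

/// Hypothetical description of the capture area, in top-left-origin points.
struct CaptureArea {
    var originX: Double
    var originY: Double
    var width: Double
    var height: Double
}

/// Converts an AppKit (bottom-left origin) mouse location into normalized
/// capture-area coordinates. Assumes one display of height `displayHeight`.
func normalize(mouseX: Double, mouseY: Double,
               displayHeight: Double, area: CaptureArea) -> (x: Double, y: Double) {
    let sckY = displayHeight - mouseY   // flip to top-left origin
    let rawX = mouseX - area.originX    // translate into capture-area space
    let rawY = sckY - area.originY
    return (x: rawX / area.width, y: rawY / area.height)
}

let area = CaptureArea(originX: 100, originY: 50, width: 800, height: 600)
let n = normalize(mouseX: 500, mouseY: 730, displayHeight: 1080, area: area)
// n.x == 0.5, n.y == 0.5: the cursor sits at the center of the capture area
```

Nothing here clamps the result, so a cursor outside the capture area produces values below 0 or above 1.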
Spring physics smoothing
Raw cursor paths from a recording tend to have micro-jitter from hand tremor and sloppy mousing. The smoother attaches a virtual mass to the real cursor via a spring, so the rendered path trails behind with natural-looking deceleration.
Spring Physics Model
Each smoothed position is computed by simulating a spring-damper system. At each 1 ms sub-step the spring force (tension × displacement) and damping force (friction × velocity) are applied to the mass, updating velocity and position.
1 ms Sub-stepping
The 8 ms gap between consecutive cursor samples is subdivided into 1 ms integration steps. This gives the spring simulation enough resolution to produce smooth acceleration and deceleration curves between samples.
Four Speed Presets
Slow (tension 80, friction 20, mass 3.0) produces deliberate lag. Medium (170 / 26 / 1.5) is the default. Fast (300 / 34 / 1.0) and Rapid (500 / 44 / 0.6) track the raw cursor more closely while still removing jitter.
Output at Original Timestamps
Output is one sample per input sample at the same timestamps, not the 1 ms sub-steps. Keeps the metadata file compact while the physics run at full resolution internally.
Physics loop (1 ms steps)
for each 1 ms step:
accelX = (tension × (targetX − posX)
− friction × velX) / mass
accelY = (tension × (targetY − posY)
− friction × velY) / mass
velX += accelX × 0.001
velY += accelY × 0.001
posX += velX × 0.001
posY += velY × 0.001
Preset parameters
Preset    tension   friction   mass
────────  ────────  ────────   ──────
Slow      80        20         3.0
Medium    170       26         1.5   (default)
Fast      300       34         1.0
Rapid     500       44         0.6
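A runnable Swift sketch of the smoother, combining the physics loop and the preset table above. It rests on two assumptions the text does not settle: the spring target is held at the next raw sample for the whole gap (rather than interpolated), and the number of 1 ms sub-steps is the gap rounded to the nearest millisecond. SpringPreset and smooth are hypothetical names.

```swift
import Foundation

struct CursorSample {
    var t: Double   // seconds since recording start
    var x: Double   // normalized X
    var y: Double   // normalized Y
    var p: Bool     // any mouse button pressed
}

struct SpringPreset {
    let tension: Double
    let friction: Double
    let mass: Double

    static let slow   = SpringPreset(tension: 80,  friction: 20, mass: 3.0)
    static let medium = SpringPreset(tension: 170, friction: 26, mass: 1.5)
    static let fast   = SpringPreset(tension: 300, friction: 34, mass: 1.0)
    static let rapid  = SpringPreset(tension: 500, friction: 44, mass: 0.6)
}

/// Returns one smoothed sample per input sample, at the original timestamps.
func smooth(_ samples: [CursorSample], preset: SpringPreset = .medium) -> [CursorSample] {
    guard let first = samples.first else { return [] }
    let dt = 0.001                      // 1 ms integration step
    var posX = first.x, posY = first.y
    var velX = 0.0, velY = 0.0
    var output = [first]

    for i in 1..<samples.count {
        let target = samples[i]         // assumption: target held for the whole gap
        let gap = target.t - samples[i - 1].t
        let steps = max(1, Int((gap / dt).rounded()))
        for _ in 0..<steps {
            let accelX = (preset.tension * (target.x - posX) - preset.friction * velX) / preset.mass
            let accelY = (preset.tension * (target.y - posY) - preset.friction * velY) / preset.mass
            velX += accelX * dt
            velY += accelY * dt
            posX += velX * dt
            posY += velY * dt
        }
        // Only the state at the sample's own timestamp is kept.
        output.append(CursorSample(t: target.t, x: posX, y: posY, p: target.p))
    }
    return output
}
```

Feeding this a cursor that jumps from x = 0 to x = 1 shows the intended behaviour: the first smoothed sample has barely moved, and the path then settles onto the target instead of snapping to it.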
Cursor overlay at export
During export, CursorMetadataProvider is queried once per frame. The compositor uses the interpolated position and click state to render the cursor directly onto the composited frame in the correct pixel space.
Linear Interpolation at Export
CursorMetadataProvider uses binary search to find the two samples that bracket each video frame's composition timestamp, then linearly interpolates X and Y to get a sub-sample-accurate position.
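The lookup can be sketched as follows. This is not CursorMetadataProvider's actual code: PositionSample and position(at:in:) are hypothetical, samples are assumed to be sorted by timestamp, and times outside the recorded range are assumed to clamp to the first or last sample.

```swift
import Foundation

/// Hypothetical trimmed-down sample: timestamp plus normalized position.
struct PositionSample {
    var t: Double
    var x: Double
    var y: Double
}

/// Finds the two samples bracketing `time` by binary search and linearly
/// interpolates between them. `samples` must be sorted by ascending `t`.
func position(at time: Double, in samples: [PositionSample]) -> (x: Double, y: Double)? {
    guard let first = samples.first, let last = samples.last else { return nil }
    if time <= first.t { return (x: first.x, y: first.y) }   // assumed clamping
    if time >= last.t { return (x: last.x, y: last.y) }

    // Invariant: samples[lo].t <= time < samples[hi].t
    var lo = 0
    var hi = samples.count - 1
    while hi - lo > 1 {
        let mid = (lo + hi) / 2
        if samples[mid].t <= time { lo = mid } else { hi = mid }
    }

    let a = samples[lo], b = samples[hi]
    let f = (time - a.t) / (b.t - a.t)   // 0 at a, 1 at b
    return (x: a.x + (b.x - a.x) * f, y: a.y + (b.y - a.y) * f)
}
```

For a frame that falls at 4 ms, halfway between samples at 0 ms and 8 ms, the result is the midpoint of the two recorded positions.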
Zoom-compensated Cursor Size
The cursor is rendered at a size that accounts for the current zoom level. As the video zooms in, the cursor appears at a consistent perceived size rather than scaling up with the content.
Click Highlight Animation
Active clicks within a 0.4 s window of each frame are rendered as expanding rings. The ring starts at 50% of the configured diameter and expands to 200%, with opacity fading from full to zero over the 0.4 s duration.
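As a rough illustration of the ring animation, the function below maps a click's age to a ring diameter and opacity. The text gives only the start and end values, so the linear growth and linear fade used here are assumptions, and clickRing is a hypothetical name.

```swift
import Foundation

/// Ring geometry for a click that happened `age` seconds before the frame.
/// Returns nil once the click is outside the highlight window.
/// Assumption: both the growth (50% to 200%) and the fade are linear.
func clickRing(age: Double, diameter: Double,
               duration: Double = 0.4) -> (diameter: Double, opacity: Double)? {
    guard age >= 0, age <= duration else { return nil }
    let progress = age / duration
    let scale = 0.5 + (2.0 - 0.5) * progress
    return (diameter: diameter * scale, opacity: 1.0 - progress)
}
```

With a configured diameter of 40 px, the ring is 20 px and fully opaque at the moment of the click, 50 px at half opacity after 0.2 s, and 80 px and fully transparent at 0.4 s.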
Center vs. Hotspot Anchoring
Geometric styles (circles, rings, crosshairs) are center-anchored on the cursor position. Arrow and pointer styles use the top-left hotspot to match the visual click target of the real cursor.
Zoom keyframes
Zoom is implemented as a time-varying viewport rect over the normalized (0.0–1.0) video coordinate space. ZoomTimeline converts keyframe data into a visible rect at any given timestamp, which the compositor uses to crop and scale the source video.
Four Keyframes per Region
Each zoom region is defined by four timestamps: zoom-in start, zoom-in end (full zoom), zoom-out start (still full zoom), and zoom-out end. The gap between zoom-in end and zoom-out start is the hold duration at full magnification.
Smoothstep Easing
Transitions between zoom levels use the quintic form of smoothstep (often called smootherstep), t³(6t² − 15t + 10), which produces an ease-in-out curve with zero velocity and zero acceleration at both endpoints. This avoids the abrupt start and stop of linear interpolation.
Inverse Zoom Interpolation
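Zoom level is interpolated in the inverse domain (1/zoom) rather than linearly. Because the visible fraction of the frame is 1/zoom, this makes the viewport's width and height change in direct proportion to the eased progress. Interpolating the zoom factor itself would shrink the viewport fastest at the low-zoom end: halfway from 1× to 4× is 2.5×, which already shows only 40% of the frame width, whereas the inverse form is at 1.6× and still shows 62.5%.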
Normalized Pan Center
The visible region's center is stored as normalized (0.0–1.0) coordinates. The compositor clamps the derived rect to keep it inside the video bounds, preventing blank edges at the frame perimeter.
ZoomKeyframe structure
ZoomKeyframe {
t: Double // time in seconds
zoomLevel: Double // 1.0 = no zoom, 2.0 = 2× etc.
centerX: Double // pan center, normalized 0–1
centerY: Double // pan center, normalized 0–1
isAuto: Bool // generated by ZoomDetector
}
Viewport rect from zoom level
// Inverse zoom interpolation + smoothstep
invA = 1.0 / zoomA
invB = 1.0 / zoomB
t = smoothstep(elapsed / duration)
zoom = 1.0 / (invA + (invB − invA) × t)
// Compute visible rect
w = 1.0 / zoom   // fraction of source visible
h = 1.0 / zoom
originX = clamp(cx − w/2, 0, 1−w)
originY = clamp(cy − h/2, 0, 1−h)
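The same calculation as runnable Swift. The easing function implements the polynomial quoted in the text, t³(6t² − 15t + 10). The names ease, interpolatedZoom, visibleRect, and VisibleRect are hypothetical, and duration is assumed to be greater than zero.

```swift
import Foundation

struct VisibleRect {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// t³(6t² − 15t + 10), with the input clamped to [0, 1].
func ease(_ t: Double) -> Double {
    let x = min(max(t, 0), 1)
    return x * x * x * (x * (6 * x - 15) + 10)
}

/// Interpolates between two zoom levels in the 1/zoom domain.
func interpolatedZoom(from zoomA: Double, to zoomB: Double,
                      elapsed: Double, duration: Double) -> Double {
    let t = ease(elapsed / duration)
    let invA = 1.0 / zoomA
    let invB = 1.0 / zoomB
    return 1.0 / (invA + (invB - invA) * t)
}

/// Visible rect in normalized coordinates, clamped to stay inside the frame.
func visibleRect(zoom: Double, centerX: Double, centerY: Double) -> VisibleRect {
    let w = 1.0 / zoom                   // fraction of source visible
    let h = 1.0 / zoom
    let x = min(max(centerX - w / 2, 0), 1 - w)
    let y = min(max(centerY - h / 2, 0), 1 - h)
    return VisibleRect(x: x, y: y, width: w, height: h)
}
```

Halfway through a 0.4 s transition from 1× to 2×, the eased progress is 0.5 and the interpolated zoom is about 1.33×, so three quarters of the frame is visible. At 2× with a pan center of (0.9, 0.5), the rect would start at x = 0.65 but is clamped to 0.5 so it never leaves the frame.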
Manual zoom mode
In manual mode you create and shape every zoom region by hand on the timeline. Regions are draggable, resizable, and editable via a popover. Manual and auto are mutually exclusive; switching clears the other.
Double-click to Add
Double-click empty space on the zoom track to add a region. Created as a manual region (isAuto: false), so it's immediately interactive.
Drag to Move
Dragging the body of a region shifts all four of its keyframes by the same time delta, preserving the ease-in and ease-out durations. Regions cannot be dragged past their neighbours.
Edge Drag to Reshape
Dragging the left edge adjusts the ease-in duration while the hold and ease-out stay fixed. Dragging the right edge adjusts the ease-out duration while the ease-in and hold stay fixed. The cursor changes to a resize arrow when hovering within 8 px of either edge.
Edit Popover
Right-clicking a region opens a popover with three sliders: zoom level (1.1×–5.0× in 0.1× steps), ease-in duration (0.05–2.0 s), and ease-out duration (0.05–2.0 s). Changes apply live to the preview. A delete button removes the region entirely.
What the edit popover controls
Zoom Level   1.1× – 5.0×    step 0.1×    (peak magnification of the region)
Ease In      0.05 – 2.0 s   step 0.05 s  (zoom-in transition duration)
Ease Out     0.05 – 2.0 s   step 0.05 s  (zoom-out transition duration)
─────────────────────────────────────────────────────────────────
Hold = region span − easeIn − easeOut (clamped to minimum 0.01 s)
Pan center = set at creation time; not editable in the popover
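To make the hold formula concrete, here is a small Swift sketch that derives the four keyframe times of a region from its start, span, and ease durations. ZoomRegionTiming and makeTiming are hypothetical names. The sketch also assumes that when the hold is clamped to 0.01 s the region simply ends later than the requested span, a detail the text leaves open.

```swift
import Foundation

/// The four keyframe times of one zoom region, in seconds.
struct ZoomRegionTiming {
    var zoomInStart: Double
    var zoomInEnd: Double      // full zoom reached
    var zoomOutStart: Double   // still at full zoom
    var zoomOutEnd: Double

    var hold: Double { return zoomOutStart - zoomInEnd }
}

/// Hold = span − easeIn − easeOut, clamped to a minimum of 0.01 s.
func makeTiming(start: Double, span: Double,
                easeIn: Double, easeOut: Double) -> ZoomRegionTiming {
    let hold = max(span - easeIn - easeOut, 0.01)
    let zoomInEnd = start + easeIn
    let zoomOutStart = zoomInEnd + hold
    return ZoomRegionTiming(zoomInStart: start,
                            zoomInEnd: zoomInEnd,
                            zoomOutStart: zoomOutStart,
                            zoomOutEnd: zoomOutStart + easeOut)
}
```

A 1.5 s region with 0.4 s ease-in and ease-out holds at full zoom for 0.7 s. A 0.5 s region with 0.3 s eases would need a negative hold, so the clamp applies and the hold becomes 0.01 s.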
Automatic zoom detection
ZoomDetector analyses the click event list from cursor metadata to generate zoom keyframes automatically. It groups nearby clicks into activity clusters and creates a zoom-in/hold/zoom-out region centered on each cluster.
Click Clustering
Clicks are grouped into clusters by scanning the click event list and starting a new cluster whenever the gap between two consecutive clicks exceeds the dwell threshold (default 0.5 s). Each cluster becomes one zoom region.
Cluster Center as Pan Target
The zoom center for each cluster is the arithmetic mean of all click X and Y coordinates within the cluster. This places the zoomed viewport over the centroid of activity rather than any individual click.
Transition Duration (0.4 s default)
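The ease-in and ease-out transition durations are taken from ZoomDetectorConfig. The ease-in begins one transition duration before the cluster's first click, so full zoom is reached as that click happens. The ease-out begins once the hold ends, which is at the last click or after the minimum hold, whichever is later. Both transitions therefore sit outside the hold phase.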
Mutually Exclusive Modes
Auto and manual zoom are mutually exclusive. Auto-generated keyframes are read-only in the timeline and can't be moved, resized, or deleted. Switching to manual mode clears them via clearAutoKeyframes().
Keyframe placement for one cluster
cluster       = clicks grouped by gap < dwellThreshold (0.5 s)
zoomInTime    = cluster.startTime − transitionDuration   // ease-in begins
zoomStartTime = cluster.startTime                        // full zoom reached
zoomEndTime   = max(cluster.endTime, start + holdMin)    // hold ends
zoomOutTime   = zoomEndTime + transitionDuration         // ease-out complete
center        = mean(click.x), mean(click.y)             // centroid of all clicks
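A Swift sketch of clustering and keyframe placement, following the pseudocode above. It is not ZoomDetector's actual code: DetectorConfig stands in for ZoomDetectorConfig, and its holdMin and zoomLevel defaults are made-up values, since the text does not state them. The `start` in the pseudocode is read here as the cluster's start time.

```swift
import Foundation

struct CursorClickEvent {
    var t: Double
    var x: Double
    var y: Double
    var button: Int
}

struct ZoomKeyframe {
    var t: Double
    var zoomLevel: Double
    var centerX: Double
    var centerY: Double
    var isAuto: Bool
}

/// Hypothetical stand-in for ZoomDetectorConfig.
struct DetectorConfig {
    var dwellThreshold = 0.5       // seconds, from the text
    var transitionDuration = 0.4   // seconds, from the text
    var holdMin = 1.0              // assumed value
    var zoomLevel = 2.0            // assumed value
}

/// Starts a new cluster whenever the gap to the previous click exceeds `gap`.
func clusters(of clicks: [CursorClickEvent], gap: Double) -> [[CursorClickEvent]] {
    var result: [[CursorClickEvent]] = []
    for click in clicks.sorted(by: { $0.t < $1.t }) {
        if let previous = result.last?.last, click.t - previous.t <= gap {
            result[result.count - 1].append(click)
        } else {
            result.append([click])
        }
    }
    return result
}

/// Four keyframes (zoom-in start, full zoom, hold end, zoom-out end) for one cluster.
func keyframes(for cluster: [CursorClickEvent], config: DetectorConfig) -> [ZoomKeyframe] {
    guard let first = cluster.first, let last = cluster.last else { return [] }
    let count = Double(cluster.count)
    let cx = cluster.reduce(0.0) { $0 + $1.x } / count   // centroid of all clicks
    let cy = cluster.reduce(0.0) { $0 + $1.y } / count

    let zoomStart = first.t
    let zoomEnd = max(last.t, zoomStart + config.holdMin)
    let zoomIn = zoomStart - config.transitionDuration
    let zoomOut = zoomEnd + config.transitionDuration

    func key(_ t: Double, _ level: Double) -> ZoomKeyframe {
        return ZoomKeyframe(t: t, zoomLevel: level, centerX: cx, centerY: cy, isAuto: true)
    }
    return [key(zoomIn, 1.0), key(zoomStart, config.zoomLevel),
            key(zoomEnd, config.zoomLevel), key(zoomOut, 1.0)]
}
```

Clicks at 1.0 s, 1.3 s, 1.6 s, and 4.0 s form two clusters, because only the last gap exceeds 0.5 s. With the assumed 1.0 s minimum hold, the first cluster yields keyframes at 0.6 s, 1.0 s, 2.0 s, and 2.4 s. One case the sketch does not handle is a first click earlier than the transition duration, which would give a negative zoom-in time.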