Recording Editor Export Captions Camera Backgrounds Cursor & Zoom Camera Animations

Internals

Cursor metadata & zoom

Cursor position is recorded at 120 Hz, separate from video, and stored as normalized coordinates. At export, this data drives cursor rendering, click highlights, and zoom, whether you place keyframes by hand or let auto-detection do it.

Sampling

120 Hz cursor capture

Captured on its own timer thread, separate from the video pipeline. Cursor accuracy isn't affected by frame drops or encoding load.

8 ms Timer Resolution

A DispatchSourceTimer fires every 8 milliseconds, yielding a nominal 125 Hz sampling rate. Each tick calls NSEvent.mouseLocation for the current cursor position.

Normalized Coordinates

Screen-space coordinates are converted to normalized values (0.0–1.0) relative to the capture area on every sample. ScreenCaptureKit's top-left Y origin is applied at this point.

Button State per Sample

NSEvent.pressedMouseButtons is read alongside the position on every tick. Stored as a boolean per sample for press-hold tracking.

Pause-aware Timestamps

Timestamps are computed using CMClockGetHostTimeClock relative to the recording start. When the recording is paused, the accumulated pause duration is subtracted so timestamps remain contiguous in the final file.

Click Events (Separate Stream)

Mouse clicks are recorded as a separate event list alongside the position samples. Each click event stores its timestamp, normalized position, and the button identifier (left, right, middle).

Keystroke Events

Keyboard events are logged with timestamp, key code, modifier flags (Shift, Cmd, Option, Ctrl), and key-down/up state. Available for future overlays.

Data stored per sample

CursorSample {
  t: Double        // seconds since recording start
  x: Double        // normalized X  (0.0 = left edge of capture area)
  y: Double        // normalized Y  (0.0 = top  edge of capture area)
  p: Bool          // any mouse button currently pressed
}

CursorClickEvent {
  t: Double        // timestamp
  x, y: Double     // normalized position
  button: Int      // 0 = left, 1 = right, 2 = middle
}

Coordinate transform (per sample)

// AppKit origin is bottom-left; SCKit is top-left
sckY  = displayHeight - mouseY

// Translate into capture-area space
rawX  = mouseX - captureOrigin.x
rawY  = sckY   - captureOrigin.y

// Normalize to [0, 1]
nx    = rawX / captureWidth
ny    = rawY / captureHeight

Smoothing

Spring physics smoothing

Raw cursor paths from a recording tend to have micro-jitter from hand tremor and sloppy mousing. The smoother attaches a virtual mass to the real cursor via a spring, so the rendered path trails behind with natural-looking deceleration.

Spring Physics Model

Each smoothed position is computed by simulating a spring-damper system. At each 1 ms sub-step the spring force (tension × displacement) and damping force (friction × velocity) are applied to the mass, updating velocity and position.

1 ms Sub-stepping

The 8 ms gap between 120 Hz samples is subdivided into 1 ms integration steps. This gives the spring simulation enough resolution to produce smooth acceleration and deceleration curves between samples.

Four Speed Presets

Slow (tension 80, friction 20, mass 3.0) produces deliberate lag. Medium (170 / 26 / 1.5) is the default. Fast (300 / 34 / 1.0) and Rapid (500 / 44 / 0.6) track the raw cursor more closely while still removing jitter.

Output at Original Timestamps

Output is one sample per input sample at the same timestamps, not the 1 ms sub-steps. Keeps the metadata file compact while the physics run at full resolution internally.

Physics loop (1 ms steps)

for each 1 ms step:
  accelX = (tension × (targetX − posX)
          − friction × velX) / mass
  accelY = (tension × (targetY − posY)
          − friction × velY) / mass

  velX  += accelX × 0.001
  velY  += accelY × 0.001
  posX  += velX   × 0.001
  posY  += velY   × 0.001

Preset parameters

Preset   tension  friction  mass
──────── ──────── ──────── ──────
Slow          80        20     3.0
Medium       170        26     1.5   (default)
Fast         300        34     1.0
Rapid        500        44     0.6

Rendering

Cursor overlay at export

During export, CursorMetadataProvider is queried once per frame. The compositor uses the interpolated position and click state to render the cursor directly onto the composited frame in the correct pixel space.

Linear Interpolation at Export

CursorMetadataProvider uses binary search to find the two samples that bracket each video frame's composition timestamp, then linearly interpolates X and Y to get a sub-sample-accurate position.

Zoom-compensated Cursor Size

The cursor is rendered at a size that accounts for the current zoom level. As the video zooms in, the cursor appears at a consistent perceived size rather than scaling up with the content.

Click Highlight Animation

Active clicks within a 0.4 s window of each frame are rendered as expanding rings. The ring starts at 50% of the configured diameter and expands to 200%, with opacity fading from full to zero over the 0.4 s duration.

Center vs. Hotspot Anchoring

Geometric styles (circles, rings, crosshairs) are center-anchored on the cursor position. Arrow and pointer styles use the top-left hotspot to match the visual click target of the real cursor.

Zoom

Zoom keyframes

Zoom is implemented as a time-varying viewport rect over the normalized (0.0–1.0) video coordinate space. ZoomTimeline converts keyframe data into a visible rect at any given timestamp, which the compositor uses to crop and scale the source video.

Four Keyframes per Region

Each zoom region is defined by four timestamps: zoom-in start, zoom-in end (full zoom), zoom-out start (still full zoom), and zoom-out end. The gap between zoom-in end and zoom-out start is the hold duration at full magnification.

Smoothstep Easing

Transitions between zoom levels use smoothstep — t³(6t²−15t+10) — which produces an ease-in-out curve with zero velocity at both endpoints. This avoids the abrupt start/stop of linear interpolation.

Inverse Zoom Interpolation

Zoom level is interpolated in the inverse domain (1/zoom) rather than linearly. This matches human perception: going from 1× to 2× feels like the same visual change as going from 2× to 4×.

Normalized Pan Center

The visible region's center is stored as normalized (0.0–1.0) coordinates. The compositor clamps the derived rect to keep it inside the video bounds, preventing blank edges at the frame perimeter.

ZoomKeyframe structure

ZoomKeyframe {
  t:          Double   // time in seconds
  zoomLevel:  Double   // 1.0 = no zoom, 2.0 = 2× etc.
  centerX:    Double   // pan center, normalized 0–1
  centerY:    Double   // pan center, normalized 0–1
  isAuto:     Bool     // generated by ZoomDetector
}

Viewport rect from zoom level

// Inverse zoom interpolation + smoothstep
invA   = 1.0 / zoomA
invB   = 1.0 / zoomB
t      = smoothstep(elapsed / duration)
zoom   = 1.0 / (invA + (invB − invA) × t)

// Compute visible rect
w      = 1.0 / zoom          // fraction of source visible
h      = 1.0 / zoom
originX = clamp(cx − w/2, 0, 1−w)
originY = clamp(cy − h/2, 0, 1−h)

Manual zoom

Manual zoom mode

In manual mode you create and shape every zoom region by hand on the timeline. Regions are draggable, resizable, and editable via a popover. Manual and auto are mutually exclusive; switching clears the other.

Double-click to Add

Double-click empty space on the zoom track to add a region. Created as a manual region (isAuto: false), so it's immediately interactive.

Drag to Move

Dragging the body of a region shifts all four of its keyframes by the same time delta, preserving the ease-in and ease-out durations. Regions cannot be dragged past their neighbours.

Edge Drag to Reshape

Dragging the left edge adjusts the ease-in duration while the hold and ease-out stay fixed. Dragging the right edge adjusts the ease-out duration while the ease-in and hold stay fixed. The cursor changes to a resize arrow when hovering within 8 px of either edge.

Edit Popover

Right-clicking a region opens a popover with three sliders: zoom level (1.1×–5.0× in 0.1× steps), ease-in duration (0.05–2.0 s), and ease-out duration (0.05–2.0 s). Changes apply live to the preview. A delete button removes the region entirely.

What the edit popover controls

Zoom Level   1.1× – 5.0×   step 0.1×   (peak magnification of the region)
Ease In      0.05 – 2.0 s  step 0.05s  (zoom-in transition duration)
Ease Out     0.05 – 2.0 s  step 0.05s  (zoom-out transition duration)
─────────────────────────────────────────────────────────────────
Hold = region span − easeIn − easeOut   (clamped to minimum 0.01 s)
Pan center = set at creation time; not editable in the popover

Auto-zoom

Automatic zoom detection

ZoomDetector analyses the click event list from cursor metadata to generate zoom keyframes automatically. It groups nearby clicks into activity clusters and creates a zoom-in/hold/zoom-out region centered on each cluster.

Click Clustering

Clicks are grouped into clusters by scanning the click event list and starting a new cluster whenever the gap between two consecutive clicks exceeds the dwell threshold (default 0.5 s). Each cluster becomes one zoom region.

Cluster Center as Pan Target

The zoom center for each cluster is the arithmetic mean of all click X and Y coordinates within the cluster. This places the zoomed viewport over the centroid of activity rather than any individual click.

Transition Duration (0.4 s default)

The ease-in and ease-out transition durations are taken from ZoomDetectorConfig. The zoom keyframes are placed so the transition starts just before the first click and ends just after the last, keeping the transition out of the hold phase.

Mutually Exclusive Modes

Auto and manual zoom are mutually exclusive. Auto-generated keyframes are read-only in the timeline and can't be moved, resized, or deleted. Switching to manual mode clears them via clearAutoKeyframes().

Keyframe placement for one cluster

cluster = clicks grouped by gap < dwellThreshold (0.5 s)

zoomInTime   = cluster.startTime − transitionDuration   // ease-in begins
zoomStartTime= cluster.startTime                        // full zoom reached
zoomEndTime  = max(cluster.endTime, start + holdMin)    // hold ends
zoomOutTime  = zoomEndTime + transitionDuration         // ease-out complete

center = mean(click.x), mean(click.y)   // centroid of all clicks