Architecture

From outbound connection to direct peer-to-peer session.

The short version is on the landing page. This is the long one — how the agent connects, how the browser authenticates, how a direct link is opened, and what flows on top of it.

Connection flow

Host boot → Signaling → SRP authentication → Encrypted data flow → WebRTC establishment

What happens, step by step

Each piece of the system has a narrow job. The whole only works because every part keeps to its role.

Agent on your device

The agent runs as your user on Linux or macOS. It opens an outbound, TLS-protected WebSocket to the relay. Nothing listens on your network. When a browser asks for a session, the agent spawns an interactive shell on a PTY (pseudo-terminal) bound to that browser.

Relay = matchmaker, not middleman

The relay pairs your agent and your browser by their identifiers and relays signaling messages. After authentication and key derivation, the relay either forwards opaque encrypted bytes or steps out of the path entirely once a direct connection is established.

Browser client

The web client lives in your workspace. It speaks the same binary protocol as the native CLI and provides a full xterm.js terminal, a file manager, and a code editor. No browser extension or native helper is needed.

End-to-end encrypted traffic

After SRP-6a authentication, both sides derive an AES-256-SIV key via HKDF-SHA256. From that point on, every terminal byte, every file chunk, and every editor action is encrypted between the agent and the browser — independent of the transport in use.
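As a rough illustration of the key-derivation step, the sketch below implements HKDF-SHA256 (RFC 5869) with Python's standard library and derives session key material from a shared secret such as the one SRP-6a produces. The salt, the `info` label, the default 32-byte output length, and the name `derive_session_key` are assumptions made for this example; the page does not say how the product parameterizes HKDF, and the AES-SIV encryption step itself is not shown.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract: condense the input keying material into a pseudorandom key."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869: an empty salt means HASH_LEN zero bytes
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand: stretch the pseudorandom key to `length` bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_session_key(shared_secret: bytes, salt: bytes,
                       info: bytes = b"session-key", length: int = 32) -> bytes:
    """Hypothetical helper: shared secret in, symmetric key material out."""
    return hkdf_expand(hkdf_extract(salt, shared_secret), info, length)
```

Both peers would call the same function with the same inputs and arrive at identical key material without the key ever crossing the relay. The output length is left as a parameter because the key size an SIV implementation expects can differ between libraries.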

WebRTC peer-to-peer

In parallel with the active session, the browser initiates a WebRTC negotiation through the relay. Once the data channel opens, traffic flows directly between the two peers — the relay is no longer in the data path.

Three-layer fallback

If direct P2P can't get through — symmetric NAT, strict firewall rules — traffic moves to a TURN-assisted WebRTC path. If even that's blocked, the session keeps running over the WebSocket relay. End-to-end encryption is preserved at every layer; sessions stay reliable across any network.

Three components, one protocol

The host, the relay, and the browser each own a narrow surface. They all speak the same wire protocol so they can move data cleanly between transports.

Host (the agent)

  • Opens an outbound WSS connection to the relay
  • Performs SRP-6a authentication with the browser
  • Spawns a PTY with an interactive shell
  • Acts as WebRTC answerer when the browser initiates P2P
  • Sends and receives chunked file transfers

Relay / signaling server

  • Accepts authenticated WSS connections from agents and browsers
  • Pairs them by device identifier
  • Relays SRP messages and encrypted WebRTC offers, answers, and ICE candidates
  • Forwards opaque encrypted bytes only when P2P is not available
  • Cannot decrypt session payloads — keys are never exchanged with it
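The relay's role described above can be sketched as a small in-memory pairing table. This is only an illustration: the class `Relay`, its method names, the use of Python lists as stand-ins for connections, and the one-browser-per-device simplification are all assumptions, not the product's actual implementation.

```python
class Relay:
    """Toy matchmaker: pairs an agent and a browser by device identifier."""

    def __init__(self) -> None:
        self._agents = {}    # device_id -> list standing in for the agent's connection
        self._browsers = {}  # device_id -> list standing in for the browser's connection

    def register_agent(self, device_id: str) -> list:
        """Called when an agent's outbound connection arrives."""
        inbox = []
        self._agents[device_id] = inbox
        return inbox

    def attach_browser(self, device_id: str) -> list:
        """Called when a browser asks for a device; fails if no agent is online."""
        if device_id not in self._agents:
            raise KeyError(f"no agent online for device {device_id!r}")
        inbox = []
        self._browsers[device_id] = inbox
        return inbox

    def forward(self, device_id: str, sender: str, blob: bytes) -> None:
        """Pass opaque bytes to the other side without inspecting them."""
        target = self._browsers if sender == "agent" else self._agents
        target[device_id].append(blob)
```

Note that `forward` never looks inside `blob`: in this sketch the relay only needs the device identifier to route a message, which matches the idea that it cannot decrypt session payloads.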

Browser client

  • Lives inside your XShell workspace at /workspace
  • Initiates SRP-6a authentication and WebRTC negotiation
  • Renders the terminal, file manager, and code editor
  • Falls back to the relay if WebRTC negotiation fails
  • Logs P2P state so you can see whether the session is direct

The protocol underneath

The same binary format carries everything: authentication, shell I/O, file transfers, WebRTC signaling, control commands.

Binary wire protocol

A length-prefixed JSON header followed by an optional binary payload. Each message has a type (auth, data, file, cmd, webrtc, encrypted, ...) and travels over the same connection — whether that connection is WebSocket or a WebRTC data channel.
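A minimal sketch of such a frame, assuming a 4-byte big-endian length prefix and UTF-8 JSON; the page does not state the prefix width, byte order, or header field names beyond `type`, so treat those details and the function names as hypothetical.

```python
import json
import struct


def encode_message(msg_type: str, payload: bytes = b"", **fields) -> bytes:
    """Build one frame: [4-byte header length][JSON header][optional binary payload]."""
    header = json.dumps({"type": msg_type, **fields}, separators=(",", ":")).encode("utf-8")
    return struct.pack(">I", len(header)) + header + payload


def decode_message(frame: bytes):
    """Split a frame back into its header dict and raw payload bytes."""
    if len(frame) < 4:
        raise ValueError("frame is shorter than the length prefix")
    (header_len,) = struct.unpack_from(">I", frame, 0)
    end = 4 + header_len
    if end > len(frame):
        raise ValueError("declared header length exceeds the frame size")
    header = json.loads(frame[4:end].decode("utf-8"))
    return header, frame[end:]
```

Because the frame is just bytes, the same encoder and decoder work unchanged whether the bytes travel in a WebSocket message or over a WebRTC data channel.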

Multiplexed sessions

Multiple concurrent PTY sessions share a single transport connection. The signaling server assigns each browser a session identifier, and the agent routes encrypted traffic to the correct PTY based on it. No additional sockets are opened per session.
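One way to picture the routing on the agent side is a table from session identifier to a per-session sink. In this sketch the sinks are plain callables standing in for PTY writers, and `SessionRouter` and its methods are hypothetical names, not the product's API.

```python
from typing import Callable, Dict


class SessionRouter:
    """Routes incoming data to the right session over one shared connection."""

    def __init__(self) -> None:
        self._sinks: Dict[str, Callable[[bytes], None]] = {}

    def open(self, session_id: str, sink: Callable[[bytes], None]) -> None:
        """Register a sink (for example, a PTY writer) for a new session."""
        if session_id in self._sinks:
            raise ValueError(f"session {session_id!r} is already open")
        self._sinks[session_id] = sink

    def route(self, session_id: str, data: bytes) -> bool:
        """Deliver data to its session; return False if the session is unknown."""
        sink = self._sinks.get(session_id)
        if sink is None:
            return False
        sink(data)
        return True

    def close(self, session_id: str) -> None:
        """Forget a session once its PTY has exited."""
        self._sinks.pop(session_id, None)
```

For example, two browser tabs would appear as two entries in the table, and each incoming message is delivered only to the sink whose identifier matches.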

Chunked file transfer

Files move in 64 KB chunks — small enough to be safe on WebRTC data channels and WebSocket frames. A running SHA-256 hash is built during transfer and verified on the receiver before the file is committed.

Continuity counter

Every encrypted packet carries a monotonic counter, strictly increasing per session and per direction. Replays, reorders, and tampering are detected and rejected before the payload is processed.
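A minimal sketch of the counter check, assuming the receiver accepts any counter greater than the last one it accepted for that session and direction. Whether gaps are tolerated, how wide the counter is, and the name `ReplayGuard` are assumptions. Tamper detection is not shown here; it would come from authenticating the packet, counter included, during decryption.

```python
from typing import Dict, Tuple


class ReplayGuard:
    """Tracks the last accepted counter per (session, direction) pair."""

    def __init__(self) -> None:
        self._last: Dict[Tuple[str, str], int] = {}

    def accept(self, session_id: str, direction: str, counter: int) -> bool:
        """Return True only if `counter` is strictly greater than the last accepted one."""
        key = (session_id, direction)
        last = self._last.get(key, -1)
        if counter <= last:
            return False  # replayed or reordered packet
        self._last[key] = counter
        return True
```

With this rule a packet that arrives twice, or arrives after a later packet has already been accepted, is rejected before its payload is looked at.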

Three-layer connection fallback

Sessions stay up on restrictive networks by falling back through three layers in order. End-to-end encryption is preserved at every layer: the relay never sees plaintext.

1. Direct P2P

Preferred · Free

  • Browser and agent connect directly via WebRTC
  • Latency drops to the round-trip time between you and your device
  • Relay drops out of the data path entirely
  • Always free: never counted against any quota

2. TURN-assisted

Pro · Counts toward quota

  • For symmetric NATs and strict firewall rules that block direct P2P
  • Still a WebRTC data channel, relayed through a TURN server
  • Pro plans include 50 GB / month of TURN traffic
  • Pro users can configure their own TURN server in settings to opt out of our quota

3. WSS relay

Last resort · Durability guarantee

  • Used when WebRTC cannot be established at all
  • Session keeps running over the TLS-protected WebSocket (WSS) relay
  • End-to-end encryption preserved — relay still cannot decrypt
  • The UI shows whether the active session is direct, TURN-assisted, or relayed
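The ordering of the three layers can be sketched as a simple "first transport that works" loop. The layer names mirror the list above; the function `connect_with_fallback` and the connect callables are hypothetical stand-ins. In practice the session appears to start on the relay and move to WebRTC once negotiation succeeds, which this sketch does not model.

```python
from typing import Callable, Sequence, Tuple

Attempt = Tuple[str, Callable[[], bool]]


def connect_with_fallback(attempts: Sequence[Attempt]) -> str:
    """Try each transport in order and return the name of the first that succeeds."""
    for name, try_connect in attempts:
        try:
            if try_connect():
                return name
        except ConnectionError:
            continue  # treat a failed attempt like a refusal and try the next layer
    raise ConnectionError("no transport could be established")


def blocked() -> bool:
    """Stand-in for a transport that the network refuses."""
    raise ConnectionError("blocked by network")


# Example: direct P2P is blocked, TURN is unavailable, the WSS relay works.
layers = [
    ("direct-p2p", blocked),
    ("turn-assisted", lambda: False),
    ("wss-relay", lambda: True),
]
active = connect_with_fallback(layers)
```

Since the payload is already encrypted end to end before it reaches any of these transports, the choice of layer affects latency and quota, not confidentiality.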

Get the agent running

One command on the machine you want to reach. Pair it from your browser. You're done.