If the Inventory module is the robot's 'memory,' then the Scout module is its 'eyes.' Amid the turbulence of tens of thousands of state changes per second that Solana generates, Scout's task is to rapidly sift, filter, and decode the signals that truly matter to arbitrage strategies.

In the world of MEV, speed is not everything, but without speed, there is nothing. This article will delve into how to build a low-latency, high-concurrency transaction listening and parsing system.

1. Listening Philosophy: Scalpel vs. Big Fishing Net

On Solana, we typically face two distinctly different listening needs, corresponding to different technical paths:

1.1 accountSubscribe: A precise scalpel (Arb mode)

For cross-protocol arbitrage, we have already locked onto specific pools through Inventory. At this point we do not need to monitor the entire network; we only need to watch closely for changes in the Data field of those pool accounts.

  • Mechanism: Once the token balance or price in the pool changes, the RPC node will immediately push the latest account data.

  • Advantage: The signal is extremely direct, skipping cumbersome transaction parsing, making it the fastest path for high-frequency arbitrage.
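
To make this concrete, here is a minimal sketch of such a subscription over Solana's raw JSON-RPC WebSocket interface, using the Python websockets library. The endpoint URL and pool address are placeholders you would replace with your own RPC node and a pool taken from Inventory:

import asyncio
import json

import websockets  # pip install websockets

WS_URL = "wss://api.mainnet-beta.solana.com"  # placeholder: use your own RPC endpoint
POOL_ADDRESS = "So11111111111111111111111111111111111111112"  # placeholder: a pool account from Inventory

async def watch_pool():
    async with websockets.connect(WS_URL) as ws:
        # Subscribe to raw account data; base64 keeps the payload compact and easy to decode
        await ws.send(json.dumps({
            "jsonrpc": "2.0",
            "id": 1,
            "method": "accountSubscribe",
            "params": [POOL_ADDRESS, {"encoding": "base64", "commitment": "processed"}],
        }))
        async for raw in ws:
            msg = json.loads(raw)
            # Notifications carry the updated account in result.value.data
            if msg.get("method") == "accountNotification":
                data_b64 = msg["params"]["result"]["value"]["data"][0]
                print("pool update, data length:", len(data_b64))

asyncio.run(watch_pool())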

1.2 logsSubscribe: A large fishing net covering the entire network (Sniper mode)

For sniping new pools, we cannot predict the pool's address in advance; we can only catch 'new pool' or 'initial liquidity injection' instructions by listening to the Program Logs of specific protocols (such as Raydium or Orca).

  • Mechanism: Scan for specific keywords in logs (e.g., initialize2).

  • Challenge: The noise is extremely high, and once a hit occurs, a 'slow path' follow-up (such as a getTransaction request) is usually needed to fill in the pool's token information.
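
For reference, a hedged sketch of this subscription, again over the raw JSON-RPC WebSocket interface: the mentions filter (revisited in section 3.2) asks the node to push only logs that reference the target program, and the client then scans each notification for the keyword. The program ID shown is assumed to be Raydium's AMM program; substitute whichever protocol you are actually hunting:

import asyncio
import json

import websockets  # pip install websockets

WS_URL = "wss://api.mainnet-beta.solana.com"  # placeholder: use your own RPC endpoint
# Assumed target: Raydium's AMM program; swap in the protocol you actually watch
TARGET_PROGRAM = "675kPX9MHTjS2zt1qfr1NYHuzeLXfQM9H24wFSUt1Mp8"

async def hunt_new_pools():
    async with websockets.connect(WS_URL) as ws:
        # Server-side filter: only logs that mention the target program are pushed
        await ws.send(json.dumps({
            "jsonrpc": "2.0",
            "id": 1,
            "method": "logsSubscribe",
            "params": [{"mentions": [TARGET_PROGRAM]}, {"commitment": "processed"}],
        }))
        async for raw in ws:
            msg = json.loads(raw)
            if msg.get("method") != "logsNotification":
                continue
            value = msg["params"]["result"]["value"]
            # Client-side keyword scan: a hit means a new pool is likely being created
            if any("initialize2" in line for line in value["logs"]):
                print("possible new pool, signature:", value["signature"])
                # Slow path: fetch the full transaction in a background task (see 2.2)

asyncio.run(hunt_new_pools())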

2. Core architecture: Stream Multiplexing

In a mature system, you may need to subscribe to updates from hundreds of pools simultaneously. If you open a thread for each subscription, system overhead will explode instantly.

2.1 Asynchronous stream merging (Select All)

We adopt Rust's asynchronous ecosystem (Tokio + Futures), using select_all to merge hundreds or thousands of WebSocket subscription streams into a single event stream. This is like consolidating the views of hundreds of surveillance cameras into a display wall, processed by a single core loop (Event Loop).

2.2 Thread model and 'slow path' decoupling

The response speed of the main loop determines the upper limit of system latency.

  • Fast path (Hot Path): Receive data -> Memory decoding -> Trigger computation.

  • Slow path (Long Path): If additional RPC requests are needed to complete the information (as in Sniper mode), the work must be immediately handed off to a background task via tokio::spawn; blocking the main loop is strictly prohibited.
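
The same discipline can be sketched in the article's Python register (the helpers below are purely illustrative stand-ins, not a real decoder or RPC client): the hot path does only in-memory work, and anything that must touch the network is spawned as a background task:

import asyncio

async def fetch_transaction(signature: str) -> dict:
    # Illustrative stand-in for a real getTransaction RPC call (the slow path)
    await asyncio.sleep(0.05)
    return {"signature": signature, "tokens": ["TokenA", "TokenB"]}

def decode_update(raw: dict) -> dict:
    # Illustrative stand-in for in-memory decoding (the fast path)
    return raw

async def slow_path(update: dict) -> None:
    tx = await fetch_transaction(update["signature"])
    print("slow path enriched:", tx)

async def handle_event(raw: dict) -> None:
    update = decode_update(raw)                 # fast path: no network awaits here
    if update.get("needs_tx"):
        asyncio.create_task(slow_path(update))  # slow path: spawned, never awaited inline
    print("fast path dispatched:", update["signature"])

async def main():
    await handle_event({"signature": "sig_demo", "needs_tx": True})
    await asyncio.sleep(0.1)  # give the background task time to finish in this demo

asyncio.run(main())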

3. Extreme parsing: Skip useless information

Solana's account data (Account Data) is usually a binary Buffer. An inefficient approach is to deserialize it into a complete object, while an extreme approach is 'on-demand parsing.'

3.1 Zero-copy and offset positioning

For example, when listening to Orca Whirlpool, we may only need the sqrt_price and tick_current_index.

  • We do not need to parse the entire pool state (hundreds of bytes); we only need to read the 16 bytes sitting at a specific offset in the data stream.

  • In Rust, by combining bytemuck or simple pointer offsets, we can extract key pricing parameters at the microsecond level.
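
A hedged Python illustration of the same idea using struct.unpack_from: read only the fields you need straight out of the raw buffer. The offsets below are illustrative assumptions and must be verified against the actual on-chain Whirlpool account layout; in Rust, the equivalent is a bytemuck cast or a raw pointer offset:

import struct

# Illustrative offsets only (assumed); verify against the real Whirlpool account layout
SQRT_PRICE_OFFSET = 65   # u128 sqrt_price, 16 bytes
TICK_INDEX_OFFSET = 81   # i32 tick_current_index, 4 bytes

def parse_price_fields(account_data: bytes) -> tuple[int, int]:
    # Read only the two fields we care about; never deserialize the full pool state
    sqrt_price = int.from_bytes(account_data[SQRT_PRICE_OFFSET:SQRT_PRICE_OFFSET + 16], "little")
    (tick_current_index,) = struct.unpack_from("<i", account_data, TICK_INDEX_OFFSET)
    return sqrt_price, tick_current_index

# Fake 200-byte buffer standing in for a pool account's Data field
demo = bytearray(200)
demo[SQRT_PRICE_OFFSET:SQRT_PRICE_OFFSET + 16] = (1 << 64).to_bytes(16, "little")  # demo sqrt_price
struct.pack_into("<i", demo, TICK_INDEX_OFFSET, -18304)                            # demo tick index
print(parse_price_fields(bytes(demo)))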

3.2 The art of filters

In the logsSubscribe phase, using the mentions filter provided by RPC, we can filter out 90% of irrelevant logs on the node side, greatly reducing the network IO pressure on the Searcher side.

4. Performance optimization: shaving off milliseconds through engineering

  1. Sharded subscriptions: In response to connection limits on public RPC nodes, Scout will automatically shard whitelisted pools and concurrently receive updates through multiple WebSocket connections to avoid backpressure from a single connection.

  2. Noise reduction mechanism: For pools that change at high frequency, implement simple drop-or-merge (coalescing) logic. If the same pool produces multiple updates within 1 ms, only process the last state, saving computational resources in the strategy layer (see the sketch after this list).

  3. Prefetch index: When parsing logs, preload the Decimals information of commonly used tokens to avoid generating secondary requests when calculating price differences.
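
As promised in point 2, a minimal sketch of the coalescing idea: block on the first update, drain whatever else has already accumulated in the queue, keep only the latest state per pool, and only then hand the survivors to the strategy layer:

import asyncio

def handle(update: dict) -> None:
    # Stand-in for handing the update to the strategy layer
    print("processing", update["pool"], "->", update["data"])

async def coalescing_consumer(queue: asyncio.Queue) -> None:
    while True:
        latest = {}
        update = await queue.get()            # block for the first update
        latest[update["pool"]] = update
        while not queue.empty():              # drain everything already buffered
            update = queue.get_nowait()
            latest[update["pool"]] = update   # later updates overwrite earlier ones
        for update in latest.values():
            handle(update)                    # only the last state per pool survives

async def demo() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(5):                        # five bursts for the same pool in one tick
        queue.put_nowait({"pool": "Pool_A", "data": i})
    queue.put_nowait({"pool": "Pool_B", "data": 99})
    consumer = asyncio.create_task(coalescing_consumer(queue))
    await asyncio.sleep(0.1)
    consumer.cancel()

asyncio.run(demo())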

5. Technical demonstration: Multi-path event stream merging logic (Python simulation)

Although the high-performance core is in Rust, its 'many-to-one' merging and distribution logic can be perfectly expressed with asyncio:

import asyncio
import random

async def pool_monitor(pool_id: str):
    """Simulate a subscription stream for an independent account"""
    while True:
        await asyncio.sleep(random.uniform(0.01, 0.1))  # Simulate random push
        yield {"pool": pool_id, "data": random.random()}

async def main_scout_loop():
    # Simulate the listening list obtained from Inventory
    watchlist = ["Pool_A", "Pool_B", "Pool_C"]

    # Aggregate all streams into a single queue
    queue = asyncio.Queue()

    async def producer(pool_id):
        async for update in pool_monitor(pool_id):
            await queue.put(update)

    # Start all producer tasks
    for p in watchlist:
        asyncio.create_task(producer(p))

    print("[*] Scout engine has started, listening for multiple signals...")

    # Core consumption loop: strategy distribution processing
    while True:
        event = await queue.get()
        # Immediately trigger asynchronous computation of the strategy layer
        asyncio.create_task(execute_strategy(event))

async def execute_strategy(event):
    print(f"⚡️ Captured signal: {event['pool']} -> Trigger pricing model calculation")

if __name__ == "__main__":
    asyncio.run(main_scout_loop())

6. Summary: The most sensitive radar

The quality of the Scout module's design directly determines the robot's 'starting speed.' A good Scout should:

  • Be broad enough: Able to capture new opportunities through logs.

  • Be accurate enough: Able to lock price fluctuations through account subscriptions.

  • Be fast enough: Utilize asynchronous architecture and binary parsing to keep latency suppressed to the microsecond level.

Next step preview

We have captured the signal and obtained the raw data; what comes next? We need to convert that binary data into real asset prices. In the next article, we will delve into the AMM module, revealing how Raydium's constant product formula and Orca's concentrated liquidity math run at lightning speed in memory.

This article is written by Levi.eth, dedicated to sharing the ultimate engineering art in the Solana MEV field.

$SOL $JTO
