Service Abstractions

In this section, we discuss how centering communication on computing services raises the level of abstraction in the CONA stack and reduces complexity within and across layers.

1. Computing Service Access

At the core of the convergence layer is a new Computing Access Protocol (CAP), which enables users to access a geo-distributed computing network. Users access the computing service through a Computing Identity (CID) and process data generated by apps/services on backends owned by the user (rather than by the service provider).

Instead of connecting to a central point, CONA is designed to be a service-oriented platform where communication is driven by CIDs rather than fixed addresses. A CID corresponds to a group of (possibly changing) processes offering the same computing service. Applications can use CIDs to express their intent to access computing services directly. This elevates computing services to first-class network entities (distinct from a centralized host or interface) that can be dynamic and hosted by decentralized computing service providers.
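As a rough illustration, the following Go sketch shows what a CID-keyed service table might look like: communication is addressed to a service identity, behind which a changing group of instances can come and go. All type and field names (ServiceTable, Instance, and so on) are hypothetical, not CONA's actual API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// CID identifies a computing service, not a host; here it is a hash string.
type CID string

// Instance is one (possibly ephemeral) process currently offering a service.
type Instance struct {
	Addr string // current network locator, free to change over time
}

// ServiceTable maps a CID to the set of live instances behind it.
// Hypothetical sketch: names and fields are illustrative only.
type ServiceTable struct {
	mu      sync.RWMutex
	entries map[CID][]Instance
}

func NewServiceTable() *ServiceTable {
	return &ServiceTable{entries: make(map[CID][]Instance)}
}

// Register adds an instance under a CID; many instances may share one CID.
func (t *ServiceTable) Register(id CID, inst Instance) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.entries[id] = append(t.entries[id], inst)
}

// Resolve picks a live instance for the CID; callers never name a host.
func (t *ServiceTable) Resolve(id CID) (Instance, error) {
	t.mu.RLock()
	defer t.mu.RUnlock()
	insts := t.entries[id]
	if len(insts) == 0 {
		return Instance{}, errors.New("no instance for CID")
	}
	return insts[0], nil // a real table would load-balance here
}

func main() {
	tbl := NewServiceTable()
	tbl.Register("sha256:ab12...", Instance{Addr: "10.0.0.7:9000"})
	if inst, err := tbl.Resolve("sha256:ab12..."); err == nil {
		fmt.Println("CID resolved to", inst.Addr)
	}
}
```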

CAP can be programmed through a user-space control plane acting on service-level events. This gives network programmers hooks for ensuring service-resolution systems are up-to-date.

Contributors and users can publish computing services to CONA and subscribe to services from it. The CONA stack offers a clean service-level control/data plane split: the user-space service orchestrator manages service resolution based on policies, listens for service-related events, monitors computing performance, and communicates with other orchestrators, while CAP provides a service-level data plane responsible for connecting to computing services by forwarding over service tables. Once connected, the orchestration layer maps the new flow to its socket in the flow table, ensuring incoming packets can be demultiplexed. Connectivity is maintained across physical mobility, virtual migration, and churn of computing nodes. In the long-term roadmap of CONA, applications interact with the stack via name-based sockets that tie socket calls (e.g., bind and connect) directly to service-related events in the stack. These events update the data-plane state and are also passed up to the control plane, which may subsequently use them to update resolution and registration systems.
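The sketch below illustrates this split in miniature: a user-space orchestrator consumes service-related events and updates the service and flow tables accordingly. The event set and data structures are assumptions for illustration, not CONA's actual interfaces.

```go
package main

import "fmt"

// EventKind covers the service-level events the data plane raises.
type EventKind int

const (
	ServiceUp EventKind = iota
	ServiceDown
	FlowEstablished
)

type Event struct {
	Kind EventKind
	CID  string
	Addr string
	Flow int // demux key for an established flow (illustrative)
}

// Orchestrator is the user-space control plane: it consumes events from
// the CAP data plane and updates resolution state. Sketch only.
type Orchestrator struct {
	serviceTable map[string][]string // CID -> instance locators
	flowTable    map[int]string      // flow id -> owning socket/CID
}

func (o *Orchestrator) Handle(ev Event) {
	switch ev.Kind {
	case ServiceUp:
		o.serviceTable[ev.CID] = append(o.serviceTable[ev.CID], ev.Addr)
	case ServiceDown:
		delete(o.serviceTable, ev.CID) // a real plane would drop one instance
	case FlowEstablished:
		// Map the new flow to its socket so incoming packets demultiplex.
		o.flowTable[ev.Flow] = ev.CID
	}
}

func main() {
	o := &Orchestrator{
		serviceTable: make(map[string][]string),
		flowTable:    make(map[int]string),
	}
	events := []Event{
		{Kind: ServiceUp, CID: "cid:compute", Addr: "192.0.2.1:80"},
		{Kind: FlowEstablished, CID: "cid:compute", Flow: 42},
	}
	for _, ev := range events {
		o.Handle(ev)
	}
	fmt.Println(o.serviceTable, o.flowTable)
}
```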

As such, CONA gives service providers more control over computing access and clients more flexibility in resolving computing service providers. For instance, by forwarding the first packet of a connection based on the CID, CAP can defer binding a service until the packet reaches the part of the network with fine-grained, up-to-date information, enabling more efficient load balancing and faster failover. The rest of the traffic flows directly between endpoints according to network-layer forwarding. CAP performs signaling between endpoints to establish additional flows (over different interfaces or paths) and can migrate them over time. In doing so, CAP provides a transport-agnostic solution for node failover.
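A minimal sketch of this late binding, assuming a router-side table with per-instance load and liveness state (the instanceInfo fields are hypothetical): only the first packet of a connection takes this path, and dead instances are never bound.

```go
package main

import "fmt"

// instanceInfo is the fine-grained, up-to-date state a late-binding router
// holds about each instance (hypothetical fields for illustration).
type instanceInfo struct {
	addr string
	load float64 // current load, refreshed by the local orchestrator
	up   bool
}

// lateBind is called only for the FIRST packet of a connection: it defers
// instance selection until the packet reaches a router with fresh state.
// All later packets bypass this and use network-layer forwarding.
func lateBind(cid string, table map[string][]instanceInfo) (string, bool) {
	best, found := instanceInfo{}, false
	for _, inst := range table[cid] {
		if !inst.up {
			continue // fast failover: dead instances are never bound
		}
		if !found || inst.load < best.load {
			best, found = inst, true
		}
	}
	return best.addr, found
}

func main() {
	table := map[string][]instanceInfo{
		"cid:gpu-infer": {
			{addr: "198.51.100.2:7000", load: 0.9, up: true},
			{addr: "198.51.100.3:7000", load: 0.2, up: true},
			{addr: "198.51.100.4:7000", load: 0.1, up: false},
		},
	}
	if addr, ok := lateBind("cid:gpu-infer", table); ok {
		fmt.Println("first packet bound to", addr) // least-loaded live node
	}
}
```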

To handle a wide range of computing services and deployment scenarios, the blockchain disseminates CID prefixes (hashed names) to CONA nodes. At the same time, CAP applies rules to packets, sending them onward (if necessary, through computing service routers deeper in the network) to remote service instances. CAP does not control which forwarding rules are in the service table, when they are installed, or how they propagate to other nodes. Instead, the local service controller (i) manages the state in the service table and (ii) potentially propagates it to other service orchestrators. CONA is responsible for registering, resolving, and routing services, which supports distributed service deployment scenarios.
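The following sketch illustrates rule application over disseminated CID prefixes: CAP mechanically performs a longest-prefix match, while what the rules say (and when they change) is left to the local service controller. The rule structure is an assumption for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// rule tells CAP where to send a packet whose CID matches a prefix:
// either to a local instance or onward to a service router deeper in
// the network. Illustrative structure, not CONA's wire format.
type rule struct {
	prefix  string // hashed-name prefix disseminated via the blockchain
	nextHop string
}

// match applies the longest matching prefix rule to a packet's CID.
func match(cid string, rules []rule) (string, bool) {
	bestLen, next, ok := -1, "", false
	for _, r := range rules {
		if strings.HasPrefix(cid, r.prefix) && len(r.prefix) > bestLen {
			bestLen, next, ok = len(r.prefix), r.nextHop, true
		}
	}
	return next, ok
}

func main() {
	// The local service controller, not CAP itself, decides what goes here.
	rules := []rule{
		{prefix: "a3", nextHop: "router-east.example:4000"}, // coarse default
		{prefix: "a3f9", nextHop: "10.1.2.3:9000"},          // local instance
	}
	if hop, ok := match("a3f91c...", rules); ok {
		fmt.Println("forwarding to", hop) // longest prefix wins: 10.1.2.3:9000
	}
}
```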

2. Computing Service Registration

CONA maintains a service naming system as a separate logical layer on top of the underlying blockchain. CONA uses the blockchain to achieve consensus on the state of this naming system and to bind computing services to providers.

For resource management, the orchestration layer obtains information from the convergence layer and manages the full life cycle of computing nodes. Computing and network resources are employed through the orchestration layer, with metadata stored in the convergence layer and pointers sent to the blockchain.

Blockchains have limited bandwidth, and the worst-case performance of on-chain lookups is unacceptable. The solution is to perform service registration off-chain, generate a proof, and send only the proof on-chain. Currently, CONA uses the S/Kademlia distributed hash table (DHT) for service operations (registration, resolution, and routing). S/Kademlia is a structured peer network that mitigates a number of known attacks on the traditional Kademlia network. Nodes in the peer network maintain connections to a subset of other peers on the network. The S/Kademlia network stores service files for CONA that are similar in format to DNS zone files. Pointers to specific services are stored in service files managed by the peer network, while the actual services are hosted on computing backends. The peer nodes are programmed not to accept service file writes unless a hash of the service is present in the blockchain.
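The write-acceptance rule can be made concrete with a short sketch: registration happens off-chain, only the hash is announced on-chain, and peer nodes reject service-file writes whose hash is absent from the chain. The function names and the map standing in for chain state are assumptions for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// onChainHashes stands in for the set of service-file hashes already
// announced on the blockchain (in reality, read from chain state).
var onChainHashes = map[string]bool{}

// announce simulates the on-chain registration step: off-chain content is
// hashed and only the hash (the "proof") is sent to the chain.
func announce(serviceFile []byte) {
	h := sha256.Sum256(serviceFile)
	onChainHashes[hex.EncodeToString(h[:])] = true
}

// acceptWrite is the peer-node rule: reject any service-file write whose
// hash is not already present in the blockchain.
func acceptWrite(serviceFile []byte) error {
	h := sha256.Sum256(serviceFile)
	if !onChainHashes[hex.EncodeToString(h[:])] {
		return errors.New("hash not announced on-chain; write rejected")
	}
	return nil // store the file in the DHT
}

func main() {
	file := []byte("service: gpu-infer\npointer: backend-7.example\n")
	fmt.Println("before announce:", acceptWrite(file)) // rejected
	announce(file)
	fmt.Println("after announce: ", acceptWrite(file)) // accepted (nil)
}
```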

Separation of the Control and Service Plane: CONA decouples the security of name registration and name ownership from the availability of data associated with names by separating the control and service planes. The control plane defines the protocol for registering services (identified by service names), creating (service, hash) bindings, and binding them to owning cryptographic keypairs; it operates as a logically separate layer on top of the blockchain.

The service plane is responsible for storing service information, mainly the service-hash pairs. It consists of (a) service files for discovering services by hash or URL and (b) external storage systems for storing service information. Service names are signed by the respective service owners and verified with their public keys. Devices receive services from the service plane and verify their authenticity by checking that either the service's hash is in the service file or the service includes a signature made with the name owner's key.
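A minimal sketch of these two verification paths, using ed25519 as a stand-in signature scheme (CONA's actual key types and record format are not specified here):

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// verify implements the two acceptance paths from the text: the service
// record's hash appears in the service file, or the record carries a
// valid signature from the name owner's key.
func verify(record, sig []byte, owner ed25519.PublicKey, fileHashes map[string]bool) bool {
	h := sha256.Sum256(record)
	if fileHashes[hex.EncodeToString(h[:])] {
		return true // path (a): hash listed in the service file
	}
	// Path (b): signature checks out against the owner's public key.
	return ed25519.Verify(owner, record, sig)
}

func main() {
	pub, priv, _ := ed25519.GenerateKey(nil) // nil reader = crypto/rand
	record := []byte("cid: a3f9...\nbackend: 203.0.113.5:9000\n")
	sig := ed25519.Sign(priv, record)

	// Even with an empty service file, a valid owner signature suffices.
	fmt.Println(verify(record, sig, pub, map[string]bool{})) // true
	fmt.Println(verify(record, nil, pub, map[string]bool{})) // false
}
```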

3. Computing Service Resolution

Services are resolved through the peer network. A hash-based CID is forwarded through service tables maintained in a distributed fashion by CAP, ultimately registering or resolving with a node responsible for that CID. This resolution can coexist with an IANA-controlled resolution hierarchy, as both simply map to different rules in the same service table.

DHTs such as Kademlia require multiple network round trips for many operations, which makes it difficult to achieve millisecond-level response times. To speed up response times, we add a basic decentralized caching service on top of the peer network. The caching service lives independently in each peer node and periodically attempts to talk to every computing node in the network. It caches the last known good address for each computing node and delists nodes it has not heard from within a certain period of time. Computing nodes do not need to know about the caching services. We expect the caching service to scale for the foreseeable future, as ping operations are inexpensive, but admit that a new solution may ultimately be necessary. Space requirements are generally acceptable: for example, caching for a network of 500K nodes can be done with around 35MB of memory.
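A compact sketch of the caching service's core state and eviction rule follows; the TTL value and field names are illustrative assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// cacheEntry keeps the last known good address of one computing node.
// At roughly 70 bytes per entry, 500K nodes fit in about 35 MB.
type cacheEntry struct {
	addr     string
	lastSeen time.Time
}

// Cache is the per-peer resolution cache; computing nodes need no
// changes, they only have to answer pings. Names are illustrative.
type Cache struct {
	ttl     time.Duration
	entries map[string]cacheEntry // node ID -> entry
}

// Observe records a successful ping of a node at the given address.
func (c *Cache) Observe(nodeID, addr string, now time.Time) {
	c.entries[nodeID] = cacheEntry{addr: addr, lastSeen: now}
}

// Sweep delists every node not heard from within the TTL.
func (c *Cache) Sweep(now time.Time) {
	for id, e := range c.entries {
		if now.Sub(e.lastSeen) > c.ttl {
			delete(c.entries, id)
		}
	}
}

func main() {
	now := time.Now()
	c := &Cache{ttl: 10 * time.Minute, entries: make(map[string]cacheEntry)}
	c.Observe("node-1", "192.0.2.10:8000", now.Add(-15*time.Minute))
	c.Observe("node-2", "192.0.2.11:8000", now.Add(-1*time.Minute))
	c.Sweep(now)
	fmt.Println(len(c.entries), "node(s) still listed") // node-1 delisted
}
```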

With each Kademlia message shared on the network, nodes include their available computing power, RAM capacity, disk space, per-node bandwidth availability, wallet address, and any other metadata the network needs. The service resolution cache collects this information, allowing faster lookups.
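As an illustration, the piggybacked metadata might be modeled as follows; the field names and units are assumptions, not CONA's wire format.

```go
package main

import "fmt"

// NodeMeta is the extra payload piggybacked on each Kademlia message:
// capacity figures, bandwidth, and a wallet address. Field names and
// units are assumptions for illustration.
type NodeMeta struct {
	ComputePower  float64 // e.g., a normalized compute score
	RAMBytes      uint64
	DiskBytes     uint64
	BandwidthMbps float64
	WalletAddr    string
}

// mergeIntoCache updates the resolution cache with the metadata that rode
// along on an incoming message, so later lookups are answered locally.
func mergeIntoCache(cache map[string]NodeMeta, nodeID string, m NodeMeta) {
	cache[nodeID] = m
}

func main() {
	cache := make(map[string]NodeMeta)
	mergeIntoCache(cache, "node-7", NodeMeta{
		ComputePower:  1.4,
		RAMBytes:      32 << 30, // 32 GiB
		DiskBytes:     2 << 40,  // 2 TiB
		BandwidthMbps: 950,
		WalletAddr:    "0xabc...",
	})
	fmt.Printf("%+v\n", cache["node-7"])
}
```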

The S/Kademlia DHT is used for the pilot implementation, but it may not be the final choice. The industry is still relatively young and evolving, and it is too early to pick a winning DHT technology; it is hard to predict which DHT will be operational and reliable five years from now. The modular structure of CONA provides full compatibility with new DHT technologies, enabling the system to evolve.

4. Computing Service Routing

The service files stored in the peer network serve as pointers to the node that provides the computing service. The route to the service is then returned to the user. When the previous service provider moves or churns out of the network (which the CONA stack can quickly discover), the route for the particular service is redirected to an alternative provider, and the service file of the previous provider is updated. In this manner, CONA supports ubiquitous mobility and service migration in the face of node churn.
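A small sketch of this failover step, assuming each service file keeps an ordered list of candidate providers (an illustrative structure, not CONA's actual file format):

```go
package main

import (
	"errors"
	"fmt"
)

// serviceRoute is the pointer a service file holds: which provider node
// currently hosts the service. Hypothetical shape for illustration.
type serviceRoute struct {
	cid       string
	providers []string // ordered candidates; [0] is the active provider
}

// failover redirects the route when the active provider churns: the next
// candidate becomes active and the service file is rewritten to match.
func failover(r *serviceRoute, dead string) error {
	if len(r.providers) == 0 || r.providers[0] != dead {
		return errors.New("node is not the active provider")
	}
	if len(r.providers) == 1 {
		return errors.New("no alternative provider available")
	}
	r.providers = r.providers[1:] // redirect to the alternative
	// In CONA, this is where the service file in the peer network is updated.
	return nil
}

func main() {
	r := &serviceRoute{cid: "a3f9...", providers: []string{"node-a", "node-b"}}
	if err := failover(r, "node-a"); err == nil {
		fmt.Println("service now routed to", r.providers[0]) // node-b
	}
}
```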

The peer network only stores service files if the service was previously announced in the blockchain. This effectively whitelists the services that can be hosted by CONA. The key aspect relevant to the design of CONA is that routes (irrespective of where they are fetched from) can be verified and therefore cannot be tampered with. Further, most production servers used by peer nodes maintain a full copy of all service files, since the size of service files is relatively small. Keeping a full copy of routing data introduces only a marginal storage cost on top of storing the blockchain data.
