Cloud Crawling – How tomorrow’s cloud native application may operate (Part 2)

This article is a continuation of Part 1. The cloud crawling solution provides a novel method of realizing a cloud-based architecture in which an enterprise can use clouds from different vendors across the world to place its application, data and assets closer to the end user. In this article, I present a conceptual solution for it.

Cloud Crawling Solution

The solution comprises four separate concerns:

  1. Cloud discovery
  2. Cloud selection
  3. Term negotiation
  4. Cloud orchestration

Cloud discovery

As the number of cloud vendors grows, it isn’t a stretch to assume that third-party cloud discovery services will become available in the future. These services will provide a directory lookup for different cloud vendors, very similar to the Domain Name System. In this case, the service maintains an up-to-date list of all the clouds for different regions and, when queried with a region, returns a list of end points for all the clouds serving that region. The end points are addresses of cloud agents. A cloud agent represents the entry point for a cloud. It helps negotiate terms and enables deployment on the cloud.

The cloud discovery process finds all the available clouds for a given region by querying a cloud discovery service.
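
As a rough illustration of the lookup described above, the following Python sketch models a discovery service as an in-memory directory keyed by region. The names CloudDiscoveryService, CloudAgentEndpoint and discover_clouds, the region labels and the agent URLs are all assumptions made for illustration; a real discovery service would be a remote, third-party directory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudAgentEndpoint:
    """Address of a cloud agent, the entry point for one cloud."""
    cloud_name: str
    url: str


class CloudDiscoveryService:
    """Hypothetical in-memory stand-in for a third-party discovery directory."""

    def __init__(self):
        # Maps a region name to the agent end points of the clouds serving it.
        self._by_region = {}

    def register(self, region, endpoint):
        self._by_region.setdefault(region, []).append(endpoint)

    def lookup(self, region):
        # Return a copy so callers cannot mutate the directory.
        return list(self._by_region.get(region, []))


def discover_clouds(service, region):
    """Cloud discovery process: query the directory for a single region."""
    return service.lookup(region)


# Illustrative data only; these clouds and URLs are made up.
directory = CloudDiscoveryService()
directory.register("region-y", CloudAgentEndpoint("cloud-a", "https://agent.cloud-a.example"))
directory.register("region-y", CloudAgentEndpoint("cloud-c", "https://agent.cloud-c.example"))
directory.register("region-z", CloudAgentEndpoint("cloud-b", "https://agent.cloud-b.example"))

for endpoint in discover_clouds(directory, "region-y"):
    print(endpoint.cloud_name, endpoint.url)
```

A region with no registered clouds simply yields an empty list, which the caller can treat as "nowhere to crawl to".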

The following sequence diagram demonstrates the cloud discovery mechanism.

Cloud selection

The cloud selection process selects a cloud from the list of clouds found by the cloud discovery process for a given region. The selected cloud is the one onto which the application will then deploy itself.

The cloud selection process may compare the different cloud options using metrics such as pricing, available capacity, QoS statistics, past usage experience, SLAs, PCI compliance and number of PoPs.
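
One simple way to compare options on such metrics is to filter out clouds that fail a hard requirement and rank the rest by a weighted score. The sketch below does this in Python for three of the metrics (price, capacity and an uptime figure standing in for QoS statistics). The CloudOffer fields, the weights and the sample numbers are assumptions chosen for illustration, not a recommended scoring model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudOffer:
    name: str
    price_per_hour: float    # lower is better
    available_capacity: int  # instances on offer, higher is better
    uptime_pct: float        # stand-in for QoS statistics
    pci_compliant: bool


def select_cloud(offers, require_pci=True, weights=(0.5, 0.2, 0.3)):
    """Return the best-scoring offer, or None if no offer qualifies."""
    # Hard requirement first: drop clouds that are not PCI compliant.
    candidates = [o for o in offers if o.pci_compliant or not require_pci]
    if not candidates:
        return None

    max_price = max(o.price_per_hour for o in candidates)
    max_capacity = max(o.available_capacity for o in candidates)
    w_price, w_capacity, w_uptime = weights

    def score(offer):
        # Normalize each metric to the range 0..1 before weighting.
        price_score = (1.0 - offer.price_per_hour / max_price) if max_price else 1.0
        capacity_score = (offer.available_capacity / max_capacity) if max_capacity else 0.0
        uptime_score = offer.uptime_pct / 100.0
        return (w_price * price_score
                + w_capacity * capacity_score
                + w_uptime * uptime_score)

    return max(candidates, key=score)


# Illustrative offers only.
offers = [
    CloudOffer("cloud-a", 0.10, 50, 99.9, True),
    CloudOffer("cloud-b", 0.08, 20, 99.5, True),
    CloudOffer("cloud-c", 0.05, 100, 99.0, False),
]

best = select_cloud(offers)
print(best.name if best else "no suitable cloud")
```

Metrics such as past usage experience or SLA terms could be folded in the same way, either as additional weighted terms or as further hard filters.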

The following sequence diagram demonstrates the cloud selection mechanism.

Term negotiation

The term negotiation process kicks in after the cloud selection process has selected a cloud vendor. As part of term negotiation, the application and the cloud agent exchange their terms of use, and each side accepts or rejects the resulting contract. The contract may be binding on the cloud vendor to ensure the desired QoS for the application.
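
The exchange can be pictured as a two-sided check: the application tests the vendor's terms against its own limits, the cloud agent tests the application's request against the vendor's conditions, and a contract exists only if both accept. The Python sketch below is a minimal version of that idea. The term fields (price, guaranteed uptime, minimum commitment) and all names are assumptions for illustration; no real negotiation protocol is implied.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AppTerms:
    max_price_per_hour: float  # the most the application will pay
    min_uptime_pct: float      # the QoS the application requires


@dataclass(frozen=True)
class VendorTerms:
    price_per_hour: float
    guaranteed_uptime_pct: float
    min_commit_hours: int      # shortest reservation the vendor accepts


@dataclass(frozen=True)
class Contract:
    cloud_name: str
    price_per_hour: float
    guaranteed_uptime_pct: float


class CloudAgent:
    """Hypothetical entry point of one cloud, holding the vendor's terms."""

    def __init__(self, cloud_name, terms):
        self.cloud_name = cloud_name
        self.terms = terms

    def accepts(self, planned_hours):
        # Vendor side of the decision.
        return planned_hours >= self.terms.min_commit_hours


def negotiate(agent, app_terms, planned_hours) -> Optional[Contract]:
    """Return a Contract if both sides accept, otherwise None."""
    vendor = agent.terms
    # Application side of the decision.
    app_accepts = (vendor.price_per_hour <= app_terms.max_price_per_hour
                   and vendor.guaranteed_uptime_pct >= app_terms.min_uptime_pct)
    if not (app_accepts and agent.accepts(planned_hours)):
        return None
    return Contract(agent.cloud_name, vendor.price_per_hour,
                    vendor.guaranteed_uptime_pct)


agent = CloudAgent("cloud-a", VendorTerms(0.10, 99.9, min_commit_hours=4))
needs = AppTerms(max_price_per_hour=0.12, min_uptime_pct=99.5)
print(negotiate(agent, needs, planned_hours=8))
```

The guaranteed uptime recorded in the contract is what would make it binding on the vendor with respect to QoS.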

Cloud orchestration

Cloud orchestration acts as a supervisor and manages all the other cloud crawling processes – discovery, selection and term negotiation. Orchestration can be demand based or time based (that is, driven by a predefined schedule).

The cloud orchestration and all the processes which it manages make up the cloud crawling module of the application.
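
To make the supervisor role concrete, here is a minimal Python sketch of a crawl step that chains the three processes. The discover, select and negotiate arguments are hypothetical callables standing in for the processes described above. Falling back to the next candidate when a negotiation is rejected is my assumption; the design above does not specify what happens on rejection.

```python
def crawl(region, discover, select, negotiate):
    """Run discovery, selection and term negotiation for one region.

    Returns the cloud that accepted a contract, or None if none did.
    """
    remaining = list(discover(region))
    while remaining:
        choice = select(remaining)
        if choice is None:
            return None
        if negotiate(choice):
            return choice
        # Assumed behavior: on rejection, retry with the remaining clouds.
        remaining.remove(choice)
    return None


# Stub processes for illustration only.
def discover(region):
    return ["cloud-a", "cloud-b"] if region == "region-y" else []


def select(clouds):
    return min(clouds)  # placeholder ranking: alphabetical order


def negotiate(cloud):
    return cloud != "cloud-a"  # pretend cloud-a rejects the contract


print(crawl("region-y", discover, select, negotiate))
```

Passing the processes in as callables keeps the orchestration logic independent of how discovery, selection and negotiation are actually implemented.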

Example Application

Demand based orchestration

In this approach, if the application discovers that most of the requests are coming from a particular region then the cloud orchestration module will initiate a crawl to a cloud in that region to move closer to the user base and better serve the demand.

For example, in the figure below, the application is first brought up, or seeded, on a cloud in Region x. It then starts receiving requests from several clients in Region y (1). In response to the demand from Region y, the cloud orchestration module migrates the complete application to cloud A, a cloud in Region y (2). The clients in Region y are now served through cloud A in the same region, which lowers network latency (3). After some time, the traffic monitoring component registers several requests from clients in Region z (4). The same steps are repeated and the application is migrated to cloud B, a cloud in Region z (5). The requests from clients in Region z are now served by the deployment in cloud B (6).
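
The trigger for such a crawl could be as simple as checking whether one region accounts for most of the recent requests. The Python sketch below illustrates that check. The function names, the region labels and the 50% threshold are assumptions for illustration; a real traffic monitoring component would likely use time windows and smoothing to avoid migrating back and forth.

```python
from collections import Counter


def dominant_region(request_regions, threshold=0.5):
    """Return the region producing more than `threshold` of requests, if any."""
    if not request_regions:
        return None
    region, hits = Counter(request_regions).most_common(1)[0]
    return region if hits / len(request_regions) > threshold else None


def next_region(current_region, request_regions, threshold=0.5):
    """Decide where the application should run after observing traffic."""
    target = dominant_region(request_regions, threshold)
    if target is None or target == current_region:
        return current_region  # no clear demand shift, stay put
    return target              # demand has moved, crawl to the new region


# Illustrative traffic sample: 7 of 10 requests come from region-y.
recent = ["region-y"] * 7 + ["region-x"] * 3
print(next_region("region-x", recent))
```

When no single region exceeds the threshold, the application stays where it is.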

Time based orchestration

In this approach, the cloud orchestration module initiates the crawling process according to a predefined schedule.

For example, in the figure below, the application is first brought up in Region x and a schedule is defined as follows:

  • Time T0 – Region x
  • Time T1 – Region y
  • Time T2 – Region z

The orchestration module follows the schedule, deploying the application on cloud A in Region y at time T1 and then on cloud B in Region z at time T2.
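
A schedule like this can be represented as a sorted list of (start time, region) pairs and resolved with a binary search. The Python sketch below shows one way to do that. Expressing T0, T1 and T2 as hours 0, 8 and 16 is purely an assumption for illustration, as are the names SCHEDULE and region_at.

```python
from bisect import bisect_right

# Assumed concrete values for T0, T1 and T2, in hours since start.
SCHEDULE = [
    (0, "region-x"),   # T0
    (8, "region-y"),   # T1
    (16, "region-z"),  # T2
]


def region_at(schedule, t):
    """Return the region the application should occupy at time t.

    `schedule` must be sorted by start time.
    """
    starts = [start for start, _ in schedule]
    index = bisect_right(starts, t) - 1  # last entry whose start is <= t
    if index < 0:
        raise ValueError("time precedes the first schedule entry")
    return schedule[index][1]


for hour in (0, 7, 8, 20):
    print(hour, region_at(SCHEDULE, hour))
```

The orchestration module would call region_at periodically and start a crawl whenever the answer differs from the region it is currently deployed in.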


Multi-cloud deployment is already a reality and cloud crawling takes it one step further. I hope you enjoyed reading this article and would love to get your feedback.


Cloud Crawling – How tomorrow’s cloud native application may operate


With the proliferation of cloud technologies, computing resources, storage resources and even streaming resources have almost been reduced to commodities that can be reserved, used and then relinquished in real time without the need for exclusive ownership.

Currently, a handful of cloud vendors such as Amazon, Microsoft and Google dominate the cloud computing landscape with their cloud offerings. However, regional clouds backed by mid-size companies are also gradually becoming significant. Regional clouds have a limited number of PoPs and may be confined to a given region. However, they have the advantage of being flexible enough to conform to the domain-specific restrictions that customers may demand [1]. Regional clouds, above everything, help make the cloud marketplace more competitive.

In the future, organizations will have far more clouds to choose from. These choices will consist of a broad mix of regional and global clouds. Furthermore, there are already standards in the making which allow applications to communicate with any cloud vendor using a vendor-neutral, cloud-agnostic protocol [2].


This myriad of cloud choices across the globe, combined with the idea of dynamic reservation of resources [3, 4], could let an application “follow its users” across regions and countries without any manual DevOps intervention. An application will be able to negotiate with different cloud vendors by itself, reserve resources, and migrate in and out as needed. This is what I call cloud crawling: a process of dynamic resource migration across different clouds on the fly.

Let's look at an example. Imagine a news service provider which provides daily world news and videos. The service is completely hosted on the cloud. Let's say the service is initially seeded on a cloud geographically located on the US west coast. As viewers from different continents and countries consume the news service, the service, on the fly and without any manual intervention, negotiates with different cloud vendors located close to its viewers. It deploys itself on those clouds by standing up processing instances and by ingesting assets into the clouds' CDNs. It also un-deploys itself from a cloud based on the traffic, or perhaps based on a fixed schedule. In this manner, the service can potentially migrate across the globe and appear to be ubiquitous to all its users while serving content with minimal network latency.

The following diagram further illustrates this example of how the service which started on the US west coast over a period of time spawned across clouds and migrated across the world. At each point it discovered different regional, national and international clouds, selected the clouds based on certain criteria, negotiated service agreements and accomplished a new greenfield deployment. After the user load from any particular region decreased, it brought down its deployment and relinquished those cloud resources.

In essence, the application “crawls” across different clouds and regions. This builds upon current techniques such as multi-cloud deployment, in which the same application runs simultaneously on different clouds, and cloud bursting, in which an application spills over into an additional cloud when demand spikes. However, while those techniques mainly provide redundancy or extra capacity, cloud crawling provides a truly dynamic deployment in which the application itself discovers clouds and manages its own migration to them.


Cloud crawling represents a truly cloud-agnostic and dynamic solution for cloud native applications of tomorrow. It helps avoid vendor lock-in and exploits the competition between cloud providers on an on-going basis.

If you found this article interesting, I invite you to read Part 2, where I describe a high-level design of how cloud crawling could perhaps be implemented.


1.    The “Regional” Cloud: A Case Study


3.      Patent – US20110145393 – Method for dynamic reservation of cloud and on premises computing resources for software execution

4.      Patent – US20100076856A1 – Real-time auction of cloud computing resources