Sentinel Pyramid Builder powered by AWS Lambda - when visualization performance is crucial

A Need for Sentinel Pyramid Builder

Sentinel Hub services are based on on-the-fly processing of source satellite imagery. This makes the system cost-effective on a global scale (there is no need to pre-process the data several thousand times a day whenever a new scene arrives) and very flexible (changes often require only a refresh of the image or, in the worst case, a redeploy of the system; there is no need for re-processing). We've put significant effort into improving the performance of the system, and an average tile currently takes 1-2 seconds to generate, which is fast enough for most of our users.

However, there are cases when one needs the data to be available as fast as possible. A typical example is a disaster relief crowd-sourcing effort, when a large number of users are trying to get images of the same area. Nothing beats a pre-rendered pyramid in such cases, and we have made it easier than ever to create one. We have integrated this option into our WMS Mosaic Generator, which we use to provide effortless export of Sentinel data; it can now also generate a pyramid hosted on Amazon Web Services. The pyramid is tiled in the standard TMS format, which can easily be added to any web application.

Below is a sequence of screenshots demonstrating the workflow based on an example of a pyramid for Corsica.

The user chooses the proper configuration, acquisition date, rendering options, etc. A preview makes it possible to see what the result will be.

It takes a few minutes to generate a pyramid for a mid-size country area.

The pyramid displayed using the Leaflet JavaScript library. Browse the result of the example from the screenshots here.

As browsable examples, we have put online pre-generated pyramids for Italy, for Madagascar, and for Cuba, Jamaica and Hispaniola.

A Peek Under the Hood - Understanding How Lambda Works

Let's take a peek at how the service works underneath the covers.

The pyramid builder generates a pyramid for a region of interest (a rectangular bounding box, which currently has to be specified in EPSG:3857) from imagery generated based on an array of parameters, such as cloud coverage, date range, product (e.g. true colour, false colour, NDVI), image format (e.g. PNG, JPG), and so forth. In this post we define a pyramid as the collection of images corresponding to the tiles, from all zoom levels, that intersect the region of interest. The result is an online directory containing the pyramid in the Google Maps directory structure. The pyramid consists of Web Mercator 256 px tiles.
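To make the definition concrete, here is a minimal Python sketch of how the tiles intersecting an EPSG:3857 bounding box could be enumerated at a given zoom level. The function name `tiles_for_bbox` is hypothetical, and the Google-style numbering (y growing southwards) is an assumption based on the directory structure mentioned above; it is not taken from the service's code.

```python
import math

# Half the width of the Web Mercator (EPSG:3857) world square, in metres.
ORIGIN = math.pi * 6378137.0


def tiles_for_bbox(min_x, min_y, max_x, max_y, zoom):
    """Return the (x, y) indices of the tiles at `zoom` that intersect the bbox.

    Assumes Google-style numbering: x grows eastwards, y grows southwards.
    """
    n = 2 ** zoom                 # tiles per side at this zoom level
    tile_size = 2 * ORIGIN / n    # tile edge length in metres

    def clamp(index):
        # Keep indices inside the valid range [0, n - 1].
        return max(0, min(n - 1, index))

    x0 = clamp(math.floor((min_x + ORIGIN) / tile_size))
    x1 = clamp(math.floor((max_x + ORIGIN) / tile_size))
    y0 = clamp(math.floor((ORIGIN - max_y) / tile_size))
    y1 = clamp(math.floor((ORIGIN - min_y) / tile_size))
    return [(x, y) for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
```

Each returned pair would map to a path of the form `zoom/x/y` plus the image extension. A box whose edge lies exactly on a tile boundary also picks up the neighbouring tile, which errs on the side of covering the whole region.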

The lion's share of the processing happens on AWS Lambda, Amazon's event-driven computing service, and the results are stored on AWS S3. Very roughly, one uses AWS Lambda by providing a class that implements a suitable interface from the AWS SDK (there are SDKs available for Java, Python, and JavaScript) and uploading the project containing the class to AWS. The function has to be lightweight: it can use at most 512 MB of disk space, is allowed to write to the /tmp directory only, and is not guaranteed any persistence (files written in one invocation are not guaranteed to exist in the next); main memory is limited to 1024 MB, the number of threads to 1024, and the CPU time to 5 minutes; see AWS Lambda Limits for details. One usually writes results to either AWS S3 or AWS DynamoDB. Amazon charges for the service based on the CPU time spent by the lambdas of a given AWS account.
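For readers who have not seen a Lambda function before, the sketch below shows the general shape of one in Python, where the entry point is a plain function taking an event and a context rather than a class. The event fields (`zoom`, `x`, `y`) and the returned dictionary are made up for illustration; this is not the pyramid builder's code, which is not shown in this post.

```python
import json
import os
import tempfile


def handler(event, context):
    """Entry point in the shape AWS Lambda expects for Python functions."""
    # Hypothetical event fields describing one tile to process.
    zoom, x, y = event["zoom"], event["x"], event["y"]

    # Scratch files belong in the temp directory (/tmp on Lambda); nothing
    # written here can be relied on during a later invocation.
    scratch = os.path.join(tempfile.gettempdir(), f"{zoom}_{x}_{y}.json")
    with open(scratch, "w") as f:
        json.dump(event, f)
    size = os.path.getsize(scratch)
    os.remove(scratch)

    # A real task would render the tiles here and upload them to S3.
    return {"tile": f"{zoom}/{x}/{y}", "scratch_bytes": size}
```

Calling `handler({"zoom": 11, "x": 1, "y": 2}, None)` locally exercises the same code path that Lambda would invoke.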

When a user submits a request for a given bounding box, the service determines the level-11 tiles that intersect the bounding box and invokes a lambda task for each tile. Each task retrieves a 4096 px by 4096 px image for the tile's bounding box (using the renderer service from our internal infrastructure, the workings of which are beyond the scope of this post). It then cuts the image up into 256 level-15 tiles (each 256 px by 256 px), 64 level-14 tiles (each scaled down to 256 px by 256 px), ..., and a single level-11 tile (scaled down to 256 px by 256 px). We scale down tiles at level $15 - i$ by a factor of $2^i$. Currently we scale down by coalescing $2^{2i}$ neighbouring pixels into a single (mean) pixel. There are alternatives to averaging when it comes to scaling images; however, averaging works well enough for our current purposes. See figure below.
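The cutting and averaging step can be sketched in a few lines of pure Python. The sketch treats an image as a square list of rows of greyscale values; the function names are hypothetical, and a real implementation would work on multi-channel images with an imaging library.

```python
def downscale_mean(image, factor):
    """Shrink a square image by averaging each factor x factor pixel block."""
    size = len(image) // factor
    out = []
    for r in range(size):
        row = []
        for c in range(size):
            block = [image[r * factor + dr][c * factor + dc]
                     for dr in range(factor) for dc in range(factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def cut_tiles(image, tile_size):
    """Split a square image into a dict mapping (column, row) to a tile."""
    count = len(image) // tile_size
    return {
        (tx, ty): [row[tx * tile_size:(tx + 1) * tile_size]
                   for row in image[ty * tile_size:(ty + 1) * tile_size]]
        for ty in range(count) for tx in range(count)
    }


def build_levels(image, tile_size, levels):
    """Return {i: tiles}; the image is scaled down by 2**i before cutting."""
    return {i: cut_tiles(downscale_mean(image, 2 ** i), tile_size)
            for i in range(levels)}
```

With a 4096 px image, `tile_size=256` and `levels=5`, index `i` corresponds to zoom level 15 - i and the tile counts come out as 256, 64, 16, 4 and 1. A toy 8 px image with `tile_size=2` and `levels=3` gives 16, 4 and 1 tiles in the same way.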

When all lambda tasks finish, the S3 bucket contains 256 px by 256 px images (generated based on the chosen parameters), one per tile from zoom levels 11, 12, 13, 14, and 15. Level 11 is our "base" layer. By composing 4 neighbouring images, $I_{00}$, $I_{01}$, $I_{10}$, and $I_{11}$, into a big 512 px by 512 px image $I$, and scaling $I$ down by a factor of 2, we end up with an image for a level-10 tile. Continuing in this manner, we invoke a lambda job for each level-10 tile. See figure below.
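The combining step can be sketched as follows, again with images as square lists of rows of greyscale values. The function name is hypothetical, and reading the indices as row-then-column (so that the second image is the top-right neighbour) is an assumption.

```python
def compose_parent(i00, i01, i10, i11):
    """Stitch four equally sized square tiles together and halve the result.

    i00 is taken to be top-left, i01 top-right, i10 bottom-left and
    i11 bottom-right (an assumed reading of the indices).
    """
    # Stitch: the top two tiles side by side, then the bottom two.
    big = ([a + b for a, b in zip(i00, i01)]
           + [a + b for a, b in zip(i10, i11)])
    size = len(big) // 2
    # Halve: every 2 x 2 block of pixels becomes one mean pixel.
    return [
        [(big[2 * r][2 * c] + big[2 * r][2 * c + 1]
          + big[2 * r + 1][2 * c] + big[2 * r + 1][2 * c + 1]) / 4
         for c in range(size)]
        for r in range(size)
    ]
```

The output has the same dimensions as each input, so four 256 px tiles yield one 256 px tile for the next coarser level.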

As soon as level-10 images are on S3, we repeat the last step to generate level-9 images from the ones forming level 10; and so on down to the last level.

These last steps, which generate level-$i$ images by combining neighbouring images from level $i+1$, take much less time than the first one. Suppose that level 15, our most-detailed level, contains $n$ tiles. Then level 14 contains $n/4$ tiles, level 13 $n/4^2$ tiles, and so on; in general, level $15 - i$ contains $n/4^i$ tiles. Since I/O operations (in this case reading from and writing to S3, and retrieving images from our infrastructure) are the most expensive, we can estimate the running time based on these. For simplicity, we'll only estimate the number of writes to S3; estimating the number of reads is left as an exercise for the interested reader. There are $n$ S3 PUT requests for level-15 images and, more generally, $n/4^i$ for images at level $15 - i$. We can thus bound the number of S3 PUT requests by

$$\sum_{0 \le i \le 15} \frac{n}{4^i} \;\le\; n \sum_{i \ge 0} 4^{-i} \;=\; \frac{4}{3}\,n,$$

where the last equality follows from an old result about geometric series. This is where the well-known factor of 1.3 comes from (4/3 = 1.333...). This is useful for estimating the size of the resulting pyramid: if we know that $n$ level-15 tiles intersect our region of interest, and that each image in the pyramid takes up about 150 KB of space, then the whole pyramid will take up about $150 \cdot n \cdot 4/3$ KB, that is, about $200\,n$ KB. It is similarly useful for estimating the time to generate a pyramid for a given region, assuming we know the number of tiles forming the most-detailed level and the mean times of the retrieval and generation steps.
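As a quick illustration of the estimate, the Python sketch below computes the level-by-level sum, the 4/3 bound and the resulting size. The function name and the default of 16 levels (15 down to 0) are assumptions; the 150 KB figure is the one quoted above.

```python
def pyramid_estimate(n, kb_per_image=150, levels=16):
    """Estimate image count and size for a pyramid with n finest-level tiles.

    Returns (summed, bound, size_kb): the level-by-level sum n / 4**i,
    the geometric-series bound 4n/3, and the size implied by that bound.
    """
    summed = sum(n / 4 ** i for i in range(levels))
    bound = n * 4 / 3
    return summed, bound, bound * kb_per_image
```

For example, `pyramid_estimate(200_000)` gives a bound of roughly 266 667 images and about 40 million KB, i.e. on the order of 40 GB.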

Since we process levels 15, ..., 11 in the first step, that step does more than (1 + 1/4 + ... + 1/256) / (4/3) ≈ 0.999 of the work. Practically all the work! This means there's still some manoeuvring room available: for example, we could do as much work as possible, perhaps even all of it, with a single lambda in the combining step. A lambda invoked as part of the first (i.e. retrieval) step takes about 2 minutes to execute, while a lambda invoked as part of the second (i.e. generation) step takes only a few seconds.
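For the curious, the 0.999 figure can be worked out exactly from the finite geometric sum over the five levels handled in the first step:

$$\frac{1 + \frac{1}{4} + \dots + \frac{1}{256}}{\frac{4}{3}} = \frac{\frac{1 - 4^{-5}}{1 - \frac{1}{4}}}{\frac{4}{3}} = 1 - 4^{-5} = 1 - \frac{1}{1024} \approx 0.99902.$$

In other words, the combining steps account for only about one image in a thousand.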

We are currently running at most 280 lambda tasks concurrently because of a limit on the number of S3 PUT requests per second. In practice we limit the number of concurrently executing lambdas by maintaining a thread pool of size 280 that holds lambda invocations.
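A fixed-size thread pool is a simple way to cap concurrency. The Python sketch below shows the idea with the standard library; `invoke_lambda` is a hypothetical stand-in for a blocking Lambda invocation, not a real SDK call, and the service's own implementation may differ.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 280  # the concurrency cap quoted in the text


def invoke_lambda(tile):
    """Hypothetical stand-in for a blocking Lambda invocation."""
    zoom, x, y = tile
    return f"{zoom}/{x}/{y}"


def run_all(tiles, invoke=invoke_lambda, max_workers=MAX_CONCURRENT):
    """Invoke one task per tile, never running more than max_workers at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps the results in the same order as the input tiles.
        return list(pool.map(invoke, tiles))
```

Because each worker thread blocks until its invocation returns, the pool size directly bounds the number of lambdas in flight.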

For large areas, roughly corresponding to, say, Italy or Germany, the number of tiles can become quite large, and the larger the area, the more likely it is that some lambda task will fail. This happens for various reasons: data for the given criteria may be unavailable, an HTTP connection might fail, etc. Even if the probability of a lambda task failing is 1 in 10 000, once we have 200 000 tiles at the most-detailed level, a typical number for a big pyramid, we can expect about 26 lambdas to fail. This means we had better have a way of dealing with failures. Our current failure handling is very primitive: if a task fails, we put it back into the thread pool. We retry a task at most 10 times; if all retries fail, the pyramid contains a hole in place of the image we failed to generate. For an example of holes in a pyramid due to lambda task failures, see this pre-generated pyramid for Germany.
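The retry policy can be sketched in Python as below. Two simplifications are assumptions rather than a description of the service: the sketch retries inline instead of re-queuing the task in the thread pool, and it reads "at most 10 retries" as one initial attempt plus up to 10 more.

```python
def run_with_retries(task, max_retries=10):
    """Call task() until it succeeds or the retries are used up.

    Returns (result, attempts). The result is None when every attempt
    failed, which corresponds to a hole in the pyramid.
    """
    attempts = 0
    for _ in range(max_retries + 1):  # one initial attempt plus the retries
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            # Any failure (missing data, dropped connection, ...) is retried.
            continue
    return None, attempts
```

A caller can collect the tiles whose result is `None` to report where the holes are.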