A rendering web-crawler framework for Apache Mesos.
See the accompanying slides for more context.
RENDLER consists of three main components:

- `CrawlExecutor` extends `mesos.Executor`
- `RenderExecutor` extends `mesos.Executor`
- `RenderingCrawler` extends `mesos.Scheduler` and launches tasks with the executors
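As an illustrative sketch of this layout (a hypothetical stub, not the actual Mesos bindings): both executors interpret a task's data as a URL and differ only in what they do with it.

```python
# Hypothetical stub standing in for mesos.Executor; the real classes are
# driven by the Mesos agent rather than called directly.
class Executor:
    def launchTask(self, driver, task):
        raise NotImplementedError

class CrawlExecutor(Executor):
    def launchTask(self, driver, task):
        url = task["data"]          # task.data carries the URL to crawl
        return ("crawl", url)

class RenderExecutor(Executor):
    def launchTask(self, driver, task):
        url = task["data"]          # task.data carries the URL to render
        return ("render", url)
```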
- VirtualBox 4.1.18+
- Vagrant 1.3+
- git (command line tool)
```bash
$ wget http://downloads.mesosphere.io/demo/mesos.box -O /tmp/mesos.box
$ vagrant box add --name mesos-demo /tmp/mesos.box

$ git clone https://github.com/mesosphere/RENDLER.git
$ cd RENDLER
$ vagrant up
```
Now that the VM is running, you can view the Mesos Web UI at http://10.141.141.10:5050. You can see that one slave is registered and that idle CPUs and memory are available, so let's start RENDLER!
Check out implementations of the RENDLER scheduler in the `python`, `go`, `scala`, and `cpp` directories; run instructions are included with each.
Feel free to contribute your own!
With GraphViz (`which dot`) installed:
```bash
vagrant@mesos:hostfiles $ bin/make-pdf
Generating '/home/vagrant/hostfiles/result.pdf'
```

Open `result.pdf` in your favorite viewer to see the rendered result!
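The PDF step renders the crawl graph with GraphViz, which implies serializing the crawled adjacency list to DOT first. A minimal sketch of that serialization, assuming results are stored as `(url, links)` pairs (the function name and the exact behavior of `bin/make-pdf` are assumptions):

```python
def to_dot(crawl_results):
    # crawl_results: list of (url, [linked_urls]) pairs, i.e. an adjacency list.
    # Emits one directed edge per discovered link.
    lines = ["digraph G {"]
    for url, links in crawl_results:
        for link in links:
            lines.append('  "%s" -> "%s";' % (url, link))
    lines.append("}")
    return "\n".join(lines)
```

The resulting text can be piped to `dot -Tpdf` to produce the PDF.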
Sample Output
```bash
# Exit out of the VM
vagrant@mesos:hostfiles $ exit

# Stop the VM
$ vagrant halt

# To delete all traces of the vagrant machine
$ vagrant destroy
```
CrawlExecutor:

- Interprets incoming tasks' `task.data` field as a URL
- Fetches the resource and extracts links from the document
- Sends a framework message to the scheduler containing the crawl result
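The link-extraction step can be sketched with the standard-library HTML parser (a simplified version; the helper names are illustrative, and a real crawler would also resolve relative URLs):

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every anchor tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links
```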
RenderExecutor:

- Interprets incoming tasks' `task.data` field as a URL
- Fetches the resource and saves a PNG image to a location accessible to the scheduler
- Sends a framework message to the scheduler containing the render result
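Rendering a page to PNG typically means shelling out to a headless renderer. A sketch of the two supporting pieces, choosing a save location and building the command, where the renderer, script name, and naming scheme are all assumptions:

```python
import hashlib
import os

def image_path(url, outdir="/tmp/rendler"):
    # Hypothetical naming scheme: content-address the image by a hash of
    # the URL so concurrent render tasks never collide.
    return os.path.join(outdir, hashlib.sha1(url.encode()).hexdigest() + ".png")

def render_command(url, out_path, script="render.js"):
    # Hypothetical: delegate rasterization to a headless-browser script,
    # e.g. subprocess.check_call(render_command(url, image_path(url))).
    return ["phantomjs", script, url, out_path]
```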
We define some common data types to facilitate communication between the scheduler and the executors. Their default representation is JSON.
```python
results.CrawlResult(
    "1234",                                 # taskId
    "http://foo.co",                        # url
    ["http://foo.co/a", "http://foo.co/b"]  # links
)

results.RenderResult(
    "1234",                       # taskId
    "http://foo.co",              # url
    "http://dl.mega.corp/foo.png" # imageUrl
)
```
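Since the default representation is JSON, these types amount to simple records plus serialization. A minimal sketch using dataclasses (the `to_json` helper is an assumption, not the framework's actual API):

```python
import json
from dataclasses import asdict, dataclass
from typing import List

@dataclass
class CrawlResult:
    taskId: str
    url: str
    links: List[str]

@dataclass
class RenderResult:
    taskId: str
    url: str
    imageUrl: str

def to_json(result):
    # Default wire representation: a JSON object keyed by field name.
    return json.dumps(asdict(result))
```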
- `crawlQueue`: list of URLs
- `renderQueue`: list of URLs
- `processedURLs`: set of URLs
- `crawlResults`: list of URL tuples
- `renderResults`: map of URLs to imageUrls
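In Python terms, this state maps onto the standard containers directly (a sketch; variable names follow the list above, comments are interpretation):

```python
from collections import deque

crawlQueue = deque()    # URLs awaiting crawl tasks, drained FCFS
renderQueue = deque()   # URLs awaiting render tasks, drained FCFS
processedURLs = set()   # URLs already enqueued, to avoid re-processing
crawlResults = []       # (url, [linked_urls]) pairs: the crawl adjacency list
renderResults = {}      # url -> imageUrl
```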
The scheduler accepts one URL as a command-line parameter to seed the render and crawl queues.
- For each URL, create a task in both the render queue and the crawl queue.
- Upon receipt of a crawl result, add an element to the crawl results adjacency list. Append to the render and crawl queues each URL that is not present in the set of processed URLs. Add these enqueued URLs to the set of processed URLs.
- Upon receipt of a render result, add an element to the render results map.
- The crawl and render queues are drained in FCFS order at a rate determined by the resource offer stream. When the queues are empty, the scheduler declines resource offers to make them available to other frameworks running on the cluster.
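The crawl-result step above can be sketched as a single handler over the scheduler's state (a simplified, framework-free version; the function name and parameter order are assumptions):

```python
def handle_crawl_result(url, links, crawl_queue, render_queue, processed, crawl_results):
    # Record the adjacency-list entry for this crawl.
    crawl_results.append((url, links))
    # Enqueue only URLs not yet seen, then mark them as processed so the
    # crawler terminates on cyclic link graphs.
    for link in links:
        if link not in processed:
            processed.add(link)
            crawl_queue.append(link)
            render_queue.append(link)
```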