Alberto edited this page Oct 21, 2016 · 26 revisions

Transient is a general-purpose library whose salient features are distributed processing and Web programming, where everything is composable using standard Haskell classes and operators (Applicative, Alternative, Monad).

(You can access the Transient tutorial on the right of this page)

People have asked me for a complete example of a distributed Transient application that communicates over the network among different processes.

A runnable example is available at: https://ide.c9.io/agocorona/transienttest

It is a distributed program with N nodes (independent processes) that is accessed through a Web interface made of widgets, using many of the primitives of the transient stack. It shows three different distributed applications: a map-reduce widget which counts the words in a text, a federated chat server, and a node monitor which displays the connected nodes.

The source code of the example is less than 200 lines long, including the browser interface:

https://github.com/agocorona/transient-universe/blob/master/examples/distributedApps.hs


You need to sign in to Cloud9 and clone this project. You can then run your own copy in your personal Cloud9 instance.

In the bash console, execute three application nodes with init.sh:

agocorona:~/workspace $ cd transientinstaller
agocorona:~/workspace/transientinstaller (master) $ cat init.sh 
./distributedApps  -p start/localhost/8080 &
./distributedApps  -p start/localhost/8081/add/localhost/8080/y &
./distributedApps  -p start/localhost/8082/add/localhost/8080/y & 

agocorona:~/workspace/transientinstaller (master) $ ./init.sh

Executing: "start/localhost/8081/add/localhost/8080/y"
Executing: "start/localhost/8080"
Executing: "start/localhost/8082/add/localhost/8080/y"
.. more messages ...
Known nodes: 
[("localhost",8082,[]),("localhost",8080,[]),("localhost",8081,[])]

A cluster of three nodes will be started.

agocorona:~/workspace/transientinstaller (master) $ ps  | grep distrib
   3179 pts/1    00:00:00 distributedApps
   3180 pts/1    00:00:00 distributedApps
   3181 pts/1    00:00:00 distributedApps

Connect to the first node, at port 8080, by pointing the browser to:

http://transienttest-youruser.c9users.io/

To reach the second node: http://transienttest-youruser.c9users.io:8081

To reach the third node: http://transienttest-youruser.c9users.io:8082

All widgets execute in the cluster. This means that chat messages are propagated to all the nodes, so you can chat across browsers connected to any of the nodes, and map-reduce requests initiated in one node are executed in all three nodes. The node list is updated when a node is added or deleted.

To find out what each element of the program does, the best option is to take a look at the tutorial. There are a lot of unconventional things going on there in order to preserve composability, so be patient. This is not simple to understand, but it is simple to use. I will expand the documentation as soon as I can.

Warning: This example is at the limit of the capacity of the free Cloud9 instance. Please do not send big queries to map-reduce: this will shut down the connections and there will be no response. If that happens, kill the processes and restart them with init.sh. But do not make the text too short either: due to a bug in distribute, the map-reduce example works only with texts of three or more words.

If you want to compile it on your machine:

  > stack install ghcjs-hplay  --compiler ghcjs-0.2.0.9006020_ghc-7.10.3
  > stack install ghcjs-hplay
  > mkdir static
  > ghcjs -o static/out distributedApps.hs
  > ghc distributedApps.hs

To run the nodes, use init.sh as an example. It runs three nodes and connects them:

> cat init.sh
./distributedApps  -p start/localhost/8080 &
./distributedApps  -p start/localhost/8081/add/localhost/8080/y &
./distributedApps  -p start/localhost/8082/add/localhost/8080/y & 

Ephemeral instances are running at http://transienttest-agocorona.c9users.io for approximately 24 hours from this notice, in case you want to play with them without cloning and running your own.

Drop me a line in the chat! I will be connected.

The examples represent different distributed computing architectures. The chat example represents the actor model of Erlang, Akka and Cloud Haskell, where messages are transmitted through mailboxes. Composability is preserved, however, because nothing blocks and because a remote closure can be executed at any point of the code. The closures do not need to be static, nor do the binaries need to be identical. This allows communication between browser and server using the same cloud primitives.

The chat example uses local subscriptions and broadcasting (clustered) to have the effect of subscription to the network. It is possible to create an abstraction to hide these details but I consider that this would be less flexible.

This subscribes to the events produced in the local node:

 resp <-  local $ getMailbox chatMessages

 getMailbox :: Text -> TransIO a
 local :: TransIO a -> Cloud a

but each message is written in all the nodes:

 clustered $ local $ putMailbox chatMessages (showPrompt nick node ++ text )  >> empty :: Cloud ()

 clustered :: Loggable a => Cloud a -> Cloud a

clustered executes its argument in all the connected nodes. Since I'm not interested in the responses, I avoid receiving extra messages back from all the nodes by adding empty, which stops the threads executing the remote calls.
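The pattern described above (subscribe locally, write everywhere, ignore the replies) can be modelled in a few lines of plain Haskell. This is only an in-process sketch, with no networking and none of the transient primitives: `Cluster`, `newCluster`, `broadcast` and `nextMessage` are hypothetical names introduced for illustration, and each node's mailbox is assumed to behave like a `Chan`.

```haskell
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import qualified Data.Map.Strict as M

type NodeName = String

-- | Hypothetical model: one mailbox (a Chan) per node of the cluster.
type Cluster = M.Map NodeName (Chan String)

-- | Create an empty mailbox for every node name.
newCluster :: [NodeName] -> IO Cluster
newCluster names = M.fromList <$> mapM (\n -> (,) n <$> newChan) names

-- | Write the message into the mailbox of every node.
-- Nothing comes back from the nodes, mirroring the discarded responses.
broadcast :: Cluster -> String -> IO ()
broadcast cluster msg = mapM_ (`writeChan` msg) (M.elems cluster)

-- | Local subscription: wait for the next message in this node's own mailbox.
-- Returns Nothing when the node name is unknown.
nextMessage :: Cluster -> NodeName -> IO (Maybe String)
nextMessage cluster name = traverse readChan (M.lookup name cluster)
```

With three nodes in the cluster, one call to `broadcast` makes the same message available to a `nextMessage` reader on each of them, which is the effect that the chat widget relies on.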

But clustered only broadcasts to the server nodes. Since the program starts in the browser node, and a connection with the server is opened by wormhole at initNode, it is necessary to move to the server node with atRemote in order to start watching for messages there:

resp <- atRemote .  {- now in the server -} local $ getMailbox chatMessages

atRemote :: Loggable a => Cloud a -> Cloud a

To make sure that there is only a single subscription running despite multiple executions of the chat option, it is necessary to use single:

resp <- atRemote . local . single $ getMailbox chatMessages
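To illustrate the "only one subscription alive" guarantee, here is a minimal sketch in plain Haskell. It assumes one possible strategy, namely that starting a new subscription cancels the previous one; the text above does not say whether transient cancels the old instance or skips the new one, so this is not a description of how single is implemented. `Slot`, `newSlot`, `startSingle` and `demo` are hypothetical names.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- | Holds the cancel action of the currently active subscription, if any.
newtype Slot = Slot (IORef (IO ()))

-- | A fresh slot with nothing to cancel.
newSlot :: IO Slot
newSlot = Slot <$> newIORef (pure ())

-- | Start a subscription, cancelling whichever one was active before.
-- The `start` action must return the action that cancels what it started.
startSingle :: Slot -> IO (IO ()) -> IO ()
startSingle (Slot ref) start = do
  cancelOld <- readIORef ref
  cancelOld
  cancelNew <- start
  writeIORef ref cancelNew

-- | Demo: "open the chat" three times and count the live subscriptions.
demo :: IO Int
demo = do
  live <- newIORef (0 :: Int)
  slot <- newSlot
  let subscribe = do
        modifyIORef' live (+ 1)               -- a subscription becomes active
        pure (modifyIORef' live (subtract 1)) -- how to cancel it later
  startSingle slot subscribe
  startSingle slot subscribe
  startSingle slot subscribe
  readIORef live
```

However many times the chat option is selected, `demo` ends with exactly one live subscription, so each message is delivered to the browser once rather than once per selection.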

The map-reduce implements a spark-like dataflow:

(Figure: execution of the map-reduce example for counting words in three nodes)

The whole flow is executed in three nodes in this case. The same nodes perform the mapping, the shuffling and the reduction.
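The three phases can be sketched as pure functions, with each "node" represented by a position in a list. This is a single-process model for illustration, not the transient map-reduce API: `partitionWords`, `mapPhase`, `owner`, `shuffle`, `reducePhase` and `wordCount` are hypothetical names, and the rule that assigns a word to its reducing node is an arbitrary choice.

```haskell
import Data.Char (toLower)
import qualified Data.Map.Strict as M

type Counts = M.Map String Int

-- | Deal the words out to n nodes, round-robin (n must be positive).
partitionWords :: Int -> [String] -> [[String]]
partitionWords n ws =
  [ [w | (i, w) <- zip [0 ..] ws, i `mod` n == k] | k <- [0 .. n - 1] ]

-- | Map phase: each node counts the words of its own partition.
mapPhase :: [String] -> Counts
mapPhase = M.fromListWith (+) . map (\w -> (w, 1))

-- | Node responsible for reducing a given word (assumed rule: character sum mod n).
owner :: Int -> String -> Int
owner n w = sum (map fromEnum w) `mod` n

-- | Shuffle phase: node k receives, from every node, the counts of the words it owns.
shuffle :: Int -> [Counts] -> [[Counts]]
shuffle n partials =
  [ [M.filterWithKey (\w _ -> owner n w == k) p | p <- partials]
  | k <- [0 .. n - 1] ]

-- | Reduce phase: a node adds up the partial counts it received.
reducePhase :: [Counts] -> Counts
reducePhase = M.unionsWith (+)

-- | The full flow on n simulated nodes; the per-node results have disjoint keys.
wordCount :: Int -> String -> Counts
wordCount n =
  M.unions . map reducePhase . shuffle n . map mapPhase
    . partitionWords n . words . map toLower
```

For example, `wordCount 3 "a b a c b a"` counts `a` three times, `b` twice and `c` once, and the result does not depend on the number of nodes chosen.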

The data is obtained from the text box in the browser. atRemote performs the map-reduce in the server that the browser is connected to, which acts as master of the distributed computation. Finally, atRemote returns the result to the browser, which renders it.

The node monitor watches the list of nodes by polling it every second with sample. When the list changes, sample returns the new list, which is forwarded to the browser.
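The "poll periodically, report only changes" idea can be sketched in plain Haskell as follows. This is a model of the behaviour described above, not the implementation of sample: `changes` and `sampleLoop` are hypothetical names, and the bounded number of rounds is an assumption made so that the loop terminates.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (when)

-- | Keep a sample only when it differs from the one just before it.
changes :: Eq a => [a] -> [a]
changes [] = []
changes (x : xs) = x : go x xs
  where
    go _ [] = []
    go prev (y : ys)
      | y == prev = go prev ys
      | otherwise = y : go y ys

-- | Run `probe` every `delayMicros` microseconds, `rounds` times,
-- and call `onChange` whenever the value differs from the previous sample.
sampleLoop :: Eq a => Int -> Int -> IO a -> (a -> IO ()) -> IO ()
sampleLoop delayMicros rounds probe onChange = go rounds Nothing
  where
    go k prev
      | k <= 0 = pure ()
      | otherwise = do
          x <- probe
          when (Just x /= prev) (onChange x)
          threadDelay delayMicros
          go (k - 1) (Just x)
```

With a delay of `1000000` microseconds, `sampleLoop` probes once per second; in the node monitor, the probe would read the current node list and `onChange` would forward it to the browser.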
