Running example(s)
Transient is a general-purpose library whose salient features are distributed processing and Web programming, where everything is composable using standard Haskell classes and operators (Applicative, Alternative, Monad).
(You can access the Transient tutorial on the right of this page.)
People asked me for a complete distributed Transient application example that runs network communications among different processes.
There is a runnable example at: https://ide.c9.io/agocorona/transienttest
It is a distributed program with N nodes (independent processes) that is accessed through a Web interface made of widgets, using many of the primitives of the Transient stack. It shows three different distributed applications: a map-reduce widget which counts the words in a text, a federated chat server, and a node monitor which displays the connected nodes.
The source code of the example has less than 200 lines of code, including the browser interface:
https://github.com/agocorona/transient-universe/blob/master/examples/distributedApps.hs
It is necessary to sign in to Cloud9 and clone this project. Then you can run your own copy within your own personal Cloud9 instance.
In the bash console, execute three application nodes with init.sh:
agocorona:~/workspace $ cd transientinstaller
agocorona:~/workspace/transientinstaller (master) $ cat init.sh
./distributedApps -p start/localhost/8080 &
./distributedApps -p start/localhost/8081/add/localhost/8080/y &
./distributedApps -p start/localhost/8082/add/localhost/8080/y &
agocorona:~/workspace/transientinstaller (master) $ ./init.sh
Executing: "start/localhost/8081/add/localhost/8080/y"
Executing: "start/localhost/8080"
Executing: "start/localhost/8082/add/localhost/8080/y"
.. more messages ...
Known nodes:
[("localhost",8082,[]),("localhost",8080,[]),("localhost",8081,[])]
A cluster of three nodes will be started.
agocorona:~/workspace/transientinstaller (master) $ ps | grep distrib
3179 pts/1 00:00:00 distributedApps
3180 pts/1 00:00:00 distributedApps
3181 pts/1 00:00:00 distributedApps
Connect with the first node at port 8080 by pointing the browser to: http://transienttest-youruser.c9users.io/
To enter the second node: http://transienttest-youruser.c9users.io:8081
To enter the third node: http://transienttest-youruser.c9users.io:8082
All widgets execute in the cluster. This means that chat messages are propagated to all the nodes, so you can chat across browsers connected to any of the nodes, and map-reduce requests initiated in one of the nodes are executed in the three nodes. The node list is updated when a node is added or deleted.
To learn what each element of the program does, the best option is to take a look at the tutorial. There are a lot of unconventional things going on there in order to preserve composability, so be patient. This is not simple to understand, but it is simple to use. I will expand the documentation as soon as I can.
Warning: this example is at the limit of the capacity of the free Cloud9 instance. Please do not send big queries to map-reduce: that will shut down the connections and there will be no response. Kill the processes and restart them with init.sh. Also, due to a bug in the distributed map-reduce, the example only works with texts of three or more words.
If you want to compile it on your machine:
> stack install ghcjs-hplay --compiler ghcjs-0.2.0.9006020_ghc-7.10.3
> stack install ghcjs-hplay
> mkdir static
> ghcjs -o static/out distributedApps.hs
> ghc distributedApps.hs
To run nodes, use init.sh as an example. It runs three nodes and connects them:
> cat init.sh
./distributedApps -p start/localhost/8080 &
./distributedApps -p start/localhost/8081/add/localhost/8080/y &
./distributedApps -p start/localhost/8082/add/localhost/8080/y &
There are ephemeral instances running now at http://transienttest-agocorona.c9users.io, for approximately 24 hours from this notice, to play with if you don't want to clone and run your own.
Drop me a line in the chat! I will be connected.
The examples represent different distributed computing architectures. The chat example represents the Erlang/Akka/Cloud Haskell actor model, where messages are transmitted through mailboxes. But composability is preserved, since there is no blocking and since the remote execution of a closure can be done at any place in the code. The closures need not be static, nor must the binaries be identical. This allows communication between browser and server using the same cloud primitives.
The chat example uses local subscriptions and clustered broadcasting to obtain the effect of a subscription to the network. It would be possible to create an abstraction to hide these details, but I consider that this would be less flexible.
This subscribes to the events produced in the local node:
```haskell
resp <- local $ getMailbox chatMessages

getMailbox :: Text -> TransIO a
local      :: Loggable a => TransIO a -> Cloud a
```
but each message is written in all the nodes:
```haskell
clustered $ local $ putMailbox chatMessages (showPrompt nick node ++ text) >> empty :: Cloud ()

clustered :: Loggable a => Cloud a -> Cloud a
```
clustered executes its argument in all the connected nodes. Since I'm not interested in the responses, I avoid receiving extra messages back from all the nodes by adding empty, which stops the threads executing the remote calls.
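As a minimal sketch of this pattern, the publish side could be factored into a helper. The helper name is hypothetical; it only combines the primitives quoted above (clustered, local, putMailbox, empty) and is not taken from distributedApps.hs:

```haskell
-- Hypothetical helper, assuming the transient-universe API shown above.
-- Broadcast a chat message to every connected node without collecting replies.
broadcastChat :: Text -> Cloud ()
broadcastChat msg = clustered $ local $ do
  putMailbox chatMessages msg  -- write the message into this node's mailbox
  empty                        -- stop the thread: no response travels back
```

Because empty makes each remote computation stop right after the write, the caller never accumulates one reply per node.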
But clustered only broadcasts to the server nodes. Since the program starts in the browser node, and a connection with the server is opened by wormhole at initNode, to start watching for messages in the server node it is necessary to move to it with atRemote:
```haskell
resp <- atRemote . {- now in the server -} local $ getMailbox chatMessages

atRemote :: Loggable a => Cloud a -> Cloud a
```
To make sure that there is only a single subscription running despite the multiple executions of the chat option, it is necessary to use single:

```haskell
resp <- atRemote . local . single $ getMailbox chatMessages
```
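Putting the two sides together, the chat widget could be sketched as follows. This is a hedged outline, not the real code (which is in distributedApps.hs); render and inputLine stand in for the actual ghcjs-hplay widgets:

```haskell
-- Sketch combining the primitives above: subscribe once in the server,
-- and broadcast each message typed in the browser.
chatWidget :: Cloud ()
chatWidget = receive <|> send
  where
  receive = do
    -- move to the server, subscribe exactly once, bring each message back
    msg <- atRemote . local . single $ getMailbox chatMessages
    local $ render msg                  -- hypothetical rendering widget
  send = do
    text <- local inputLine             -- hypothetical input widget
    atRemote $ clustered $ local $
      putMailbox chatMessages text >> empty
```

The Alternative instance of Cloud lets the receiving and sending branches run concurrently in the same widget.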
The map-reduce widget implements a Spark-like dataflow. All the flow is executed in three nodes in this case; the same nodes perform the mapping, the shuffling, and the reduction.
The data is obtained from the text box in the browser. atRemote performs the map-reduce in the server of the browser, which acts as master in the distributed computation. Finally, atRemote returns the result to the browser again, which renders it.
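As an illustration, a word count in this dataflow style looks roughly like the following. This is a sketch based on the Transient.MapReduce module of transient-universe; treat the exact names and types (distribute, mapKeyB, reduce) as assumptions to be checked against the library:

```haskell
import qualified Data.Map    as M
import qualified Data.Vector as V

-- Sketch of the Spark-like dataflow: distribute the words among the
-- nodes, map each word to a (word, 1) pair (shuffled by key), and
-- reduce the counts per key in every node.
wordCount :: String -> Cloud (M.Map String Int)
wordCount text =
  reduce (+)                          -- reduction, run in every node
    . mapKeyB (\w -> (w, 1 :: Int))   -- mapping + shuffling by key
    $ distribute                      -- split the data set among the nodes
    $ V.fromList (words text)
```

In the widget, this computation is wrapped in atRemote so that the master role is played by the server of the browser.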
The node monitor watches the list of nodes by means of sample, once per second. When the list changes, it returns the new list, which is forwarded to the browser.
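Sketched in code, and assuming sample polls an IO action at a given interval in microseconds and re-emits only when the result changes, the monitor could look like this. The names getNodeList and render are hypothetical stand-ins for the actual node-listing action and rendering widget:

```haskell
-- Hypothetical sketch of the node monitor: poll the node list every
-- second in the server and push each change to the browser.
monitorNodes :: Cloud ()
monitorNodes = do
  nodes <- atRemote . local $ sample getNodeList 1000000  -- emits on change
  local $ render (show nodes)                             -- hypothetical renderer
```

The key point is that sample turns polling into an event stream, so the rest of the widget composes with it like with any other TransIO computation.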