A non R user's guide to ChatStat
We recognise that R is a less-common language for many users of Matrix. In this doc, we'll walk through how to get it set up and running.
I personally use the RStudio IDE, which is free for non-commercial use. However, you can just install R itself and use it from the command line, but you won't get such an easy interface for experimenting with plots.
Either way, you'll need R - you can get it from your package manager, if you have one, or from https://cran.r-project.org/. You will need R 4.1 or higher.
Once it's installed, check that you can fire up your R shell (either within RStudio or at the command line).
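As a quick sanity check from inside the R shell, something like the following confirms the version requirement mentioned above. It is only a sketch using base R; the 4.1 threshold is the one stated in this guide.

```r
# Compare the running R version against the minimum this guide asks for
r_ok <- getRversion() >= "4.1.0"

if (r_ok) {
  message("R ", getRversion(), " is new enough")
} else {
  message("R ", getRversion(), " is too old, please upgrade to 4.1 or higher")
}
```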
For this tutorial, we'll need a few packages. R uses CRAN as a package index, but ChatStat isn't on CRAN, so we install it from GitHub with the remotes package instead.
install.packages("tidyverse")
install.packages("remotes")
remotes::install_github("GregSutcliffe/ChatStat")
Hopefully that all goes well! If not, check for any dependency errors, and let us know so we can update this wiki.
For this example I'm going to use the This Week In Matrix channel and get 7 days of data. You'll also need a Matrix access token for your account, which you can get from the settings page in Element.
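If an install step fails quietly, it can help to list which packages R can actually find. Here is a minimal sketch using base R's `requireNamespace()`; the helper name `missing_packages` is made up for this example and is not part of ChatStat.

```r
# Hypothetical helper: return the names of packages that cannot be loaded
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The three packages this guide installs; an empty result means all are available
missing_packages(c("tidyverse", "remotes", "ChatStat"))
```

Anything the helper returns is worth reinstalling before you carry on.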
library(tidyverse)
library(ChatStat)
Sys.setenv('token' = 'syt_my_token',
'host' = 'my_matrix_homeserver.org')
Sys.setenv(LOG_LEVEL='DEBUG')
df <- get_rooms('!QQpfJfZvqxbCfeDgCj:matrix.org','2021-12-10 00:00:00')
df
If all goes well you should see something like this:
# A tibble: 1 × 2
# Groups: room [1]
room events
<chr> <list>
1 !QQpfJfZvqxbCfeDgCj:matrix.org <tibble [819 × 7]>
That means we have 819 events from the room, stored as a nested tibble in the events column. If you pass a list of more than one room, you'll get one row per room, but we'll leave that for another tutorial.
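If list columns are new to you, the toy example below shows the same shape: one row per room, with a tibble of events tucked into each row. The data is made up (these are not real Matrix events), and it assumes only the tidyverse packages installed earlier.

```r
library(tidyverse)

# Made-up events for two pretend rooms
toy <- tibble(
  room = c("!a:example.org", "!a:example.org", "!b:example.org"),
  body = c("hello", "world", "hi")
)

# Collapse to one row per room; `events` becomes a list column of tibbles
nested <- toy |> nest(events = body)
nested

# unnest() reverses it, giving one row per event again
flat <- nested |> unnest(events)
flat
```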
Now we'll unnest df and make a graph of messages per day.
df |>
# expand the rows, gives us 819 rows
unnest(events) |>
# we have all events, so filter for messages and reactions
filter( !is.na(body) | type == 'm.reaction') |>
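# truncate the sending time to just the date
# (as_datetime() is from lubridate, which library(tidyverse) attaches from
# tidyverse 2.0.0 onwards; on older versions, run library(lubridate) first)
mutate(day = as_datetime(cut(time,'day'))) |>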
# count the number of events in each date
count(day) |>
ggplot(aes(day,n)) +
geom_col() +
labs(title='Messages per day',
subtitle='Based on messages and reactions',
x = 'Date', y = 'Count') +
guides(fill='none') +
theme_minimal()
With a bit of luck, you'll get a chart like this:
From here, you can go nuts with either the tidyverse code to slice the data in different ways, or the ggplot2 code to display it in different ways. Stay tuned for more examples!
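As one possible variation, the sketch below splits the daily counts by event type and stacks them in the chart. It runs on made-up data so it works without a homeserver; the column names (time, type, body) mirror the ones used in this guide, but the values are invented for illustration.

```r
library(tidyverse)

# Made-up events; only the column names are taken from this guide
toy_events <- tibble(
  time = as.POSIXct(c("2021-12-10 09:00:00", "2021-12-10 17:30:00",
                      "2021-12-11 08:15:00", "2021-12-11 12:00:00"),
                    tz = "UTC"),
  type = c("m.room.message", "m.reaction", "m.room.message", "m.room.message"),
  body = c("hello", NA, "morning", "lunch?")
)

per_type <- toy_events |>
  # same filter as before: keep messages and reactions
  filter(!is.na(body) | type == "m.reaction") |>
  # truncate the sending time to just the date
  mutate(day = as.Date(time)) |>
  # count events per day and per type
  count(day, type)

ggplot(per_type, aes(day, n, fill = type)) +
  geom_col() +
  labs(title = "Events per day by type",
       x = "Date", y = "Count", fill = "Event type") +
  theme_minimal()
```

To try it on real data, replace toy_events with the unnested df from earlier.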