NewsMonitor

A feed aggregator with topic-focussed discovery.

NewsMonitor may be run standalone locally or on a remote server as long as there's a writeable SPARQL store somewhere accessible over HTTP. It's written in Java and was originally intended as an OSGi module for the Fusepool (/Stanbol) system, though that aspect hasn't been maintained.

Status 2023-08-13

Started revisiting this month.

It almost worked right away. I had auth issues between it and the Fuseki server, so I ended up stripping back all the auth bits. At which point I realised that I'd wasted a lot of time because my test queries weren't the same shape as the data (I was putting each feed into its own named graph, but when testing I assumed everything was going in the default graph). Oops.
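For anyone hitting the same mismatch, the two query shapes differ roughly as below. This is only a sketch: the class and method names are made up for illustration and are not from the NewsMonitor source, and it assumes the store does not expose the union of named graphs as its default graph.

```java
// Hypothetical helper, not part of NewsMonitor: contrasts the two query shapes.
final class GraphQueryShapes {

    // Plain pattern: only sees triples in the default graph
    // (assuming the store has no union-default-graph setting enabled).
    static String defaultGraphQuery(int limit) {
        return "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT " + limit;
    }

    // GRAPH pattern: sees triples in named graphs and reports the graph name.
    static String namedGraphQuery(int limit) {
        return "SELECT ?g ?s ?p ?o WHERE { GRAPH ?g { ?s ?p ?o } } LIMIT " + limit;
    }

    public static void main(String[] args) {
        System.out.println(defaultGraphQuery(10));
        System.out.println(namedGraphQuery(10));
    }
}
```

If the first query comes back empty and the second doesn't, the data is sitting in named graphs.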

Since writing this thing I have been building the headless apps around HKMS, generally using what I call the SPARQL Diamonds pattern. In this pattern the browser client calls a SPARQL store and templates the results into HTML. It does the same in reverse with any input (form data or whatever), templating it into SPARQL queries with INSERT etc. As a lingua franca I've been using Markdown.
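The input half of the pattern, templating values into an update, might look something like this. The real clients do this in the browser; this is a Java sketch to match the rest of the project, and the class name, method names and the example.org predicate are all placeholders rather than anything NewsMonitor actually uses.

```java
// Hypothetical sketch: template a Markdown string into a SPARQL INSERT DATA update.
final class SparqlTemplate {

    // Placeholder predicate, NOT the vocabulary NewsMonitor really uses.
    static final String CONTENT_PREDICATE = "http://example.org/vocab#content";

    // Escape text for a SPARQL short double-quoted string literal.
    static String escapeLiteral(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Wrap an IRI in angle brackets, rejecting characters that are not
    // allowed inside <...> so user input cannot break out of the template.
    static String iri(String s) {
        for (char c : s.toCharArray()) {
            if (c <= ' ' || "<>\"{}|^`\\".indexOf(c) >= 0) {
                throw new IllegalArgumentException("Not a safe IRI: " + s);
            }
        }
        return "<" + s + ">";
    }

    // Build an update that stores one post's Markdown in the given named graph.
    static String insertPost(String graphIri, String postIri, String markdown) {
        return "INSERT DATA { GRAPH " + iri(graphIri) + " { "
                + iri(postIri) + " " + iri(CONTENT_PREDICATE) + " \""
                + escapeLiteral(markdown) + "\" } }";
    }
}
```

The escaping matters because the values come from outside: an unescaped quote in a post would otherwise end the literal early and change the meaning of the update.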

So with NewsMonitor I want to shift to the same approach. Feed post content is saved in RDF literals as markdown. I've pretty much got this going server-side; I now need to play with the client browser app to add the markdown rendering to make it all useful/pretty.

As well as making NewsMonitor consistent with the other HKMS apps, a bonus is that it should help sanitise/normalise the raw data. At some point I want to use this as training data for a small language model, like Karpathy's llama2.c.

I pretty much abandoned NewsMonitor when I'd done enough for the contract and had other work to chase. The big thing I felt it would benefit from back then was a little work on its intelligence. If I remember correctly, when it discovers new feeds it does categorisation by string-matching on keywords. Something smarter, maybe k-nearest neighbours, could be plugged in fairly easily.
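For reference, keyword matching of the kind described might be as simple as the following. This is a guess at the general shape, not the actual NewsMonitor code; the class, method names and threshold are hypothetical.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of keyword-based relevance, not the NewsMonitor classifier.
final class KeywordScorer {

    // Count how many topic keywords occur in the text
    // (case-insensitive substring match, each keyword counted once).
    static int score(String text, List<String> keywords) {
        String lower = text.toLowerCase(Locale.ROOT);
        int hits = 0;
        for (String keyword : keywords) {
            if (lower.contains(keyword.toLowerCase(Locale.ROOT))) {
                hits++;
            }
        }
        return hits;
    }

    // A feed counts as on-topic when enough keywords match.
    static boolean isRelevant(String text, List<String> keywords, int threshold) {
        return score(text, keywords) >= threshold;
    }
}
```

A smarter classifier could sit behind the same isRelevant signature, which is roughly what "plugged in fairly easily" would mean in practice.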

But for now my aim is just to get it running again as a feed aggregator service with simple browser rendering. This is mostly for my own benefit, though the news page it'll make might be of interest to anyone who likes AI, Linked Data, modular synths and/or woodcarving.

ToDo

reintroduce auth on updates/inserts
add markdown rendering to browser clients
add a smarter classifier
make everything useful and pretty

Running standalone (with external Fuseki SPARQL server)

This assumes there's a Fuseki server running on http://localhost:3030 with a dataset called "feedreader". As of 2023, the host:port has to go in standalone-config.properties.
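As a rough sketch of how those settings map onto endpoints: Fuseki by default serves queries at /{dataset}/query and updates at /{dataset}/update. The property keys below (sparql.host, sparql.port, sparql.dataset) are invented for this example; check standalone-config.properties for the real names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: derive Fuseki endpoint URLs from properties text.
// The property keys are assumptions, not the ones NewsMonitor reads.
final class EndpointConfig {

    // Parse .properties-style text (key=value lines).
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Base URL of the dataset, falling back to the defaults named above.
    private static String base(Properties props) {
        String host = props.getProperty("sparql.host", "localhost");
        String port = props.getProperty("sparql.port", "3030");
        String dataset = props.getProperty("sparql.dataset", "feedreader");
        return "http://" + host + ":" + port + "/" + dataset;
    }

    static String queryEndpoint(Properties props) {
        return base(props) + "/query";
    }

    static String updateEndpoint(Properties props) {
        return base(props) + "/update";
    }
}
```

With no properties set this gives http://localhost:3030/feedreader/query and http://localhost:3030/feedreader/update.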

To build :

First git clone this repository, then -

cd NewsMonitor

mvn clean install -P build-for-fuseki

To run :

java -jar target/NewsMonitor-1.0.0-SNAPSHOT.jar it.danja.newsmonitor.standalone.Main

The following might not currently work

Integrated with Fusepool/Stanbol

Installation

cd to NewsMonitor directory, then :

mvn clean install

or, skipping tests :

mvn clean install -Dmaven.test.skip=true

