Ruleset testing #49

deoxykev · 2023-11-26T22:48:49Z

As per everywall/ladder-rules#3, we'll need to implement some robust testing.

The main challenge is to test after client-side JS rendering happens, which will probably mean we'll need a headless browser.

A test could look like this: https://github.com/everywall/ladder/blob/ladder_tests/tests/tests/www-wellandtribune-ca.spec.ts

And the results like this.

Perhaps we'll need some codegen in order to go from ruleset to test?

ladddder · 2023-11-27T21:25:16Z

Yes, this would be cool.

joncrangle · 2023-11-28T04:08:33Z

This seems like a very cool idea.

I played around with your example a bit, and I think we may be able to leverage GitHub Actions to run a shell script that generates and runs a test whenever a ruleset yaml is uploaded or changed. From your example, I changed await expect(page.getByText(paywallText)).toBeVisible(); to await expect(page.getByText(paywallText)).not.toBeVisible(); for the test with ladder so both tests pass and could be used for CI.

I used the following bash script to generate a Playwright test.

./generate_test.sh -i rulesets/ca/_multi-metroland-media-group.yaml > tests/_multi-metroland-media-group.spec.ts

generate_test.sh

#!/bin/bash

# Command-line argument parsing
# (the leading ":" puts getopts in silent mode, so the "\?" and ":" cases below receive $OPTARG)
while getopts ":i:" opt; do
  case $opt in
    i)
      input_file=$OPTARG
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      exit 1
      ;;
    :)
      echo "Option -$OPTARG requires an argument." >&2
      exit 1
      ;;
  esac
done

# Check if the input file is provided
if [ -z "$input_file" ]; then
  echo "Usage: $0 -i <input_yaml_file>" >&2
  exit 1
fi

# Extract information from the "tests" section
url=$(awk '/- url:/ {sub(/- url: /, ""); sub(/^[[:space:]]*/, ""); print}' "$input_file")
domain=$(echo "$url" | awk -F/ '{print $3}')
test=$(awk '/test:/ {sub(/test: /, ""); sub(/^[[:space:]]*/, ""); print}' "$input_file")

# Generate Playwright test script
echo "import { expect, test } from '@playwright/test';"
echo
echo "test('$domain has paywall by default', async ({ page }) => {"
echo "  await page.goto('$url');"
echo "  await expect($test).toBeVisible();"
echo "});"
echo
echo "test('$domain + Ladder does not have paywall', async ({ page }) => {"
echo "  await page.goto('http://localhost:8080/$url');"
echo "  await page.waitForLoadState();"
echo "  await expect($test).not.toBeVisible();"
echo "});"

In the ruleset yaml, I put a Playwright locator in the test portion:

tests:
  - url: https://www.wellandtribune.ca/news/niagara-region/niagara-transit-commission-rejects-council-request-to-reduce-its-budget-increase/article_e9fb424c-8df5-58ae-a6c3-3648e2a9df66.html
    test: page.getByText("This article is exclusive to subscribers.")
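As written, the bash script appears to assume a single url/test pair per ruleset. A rough sketch of the same codegen in TypeScript, handling several entries, might look like the following. The RulesetTest shape, the generateSpec name and the numbered test titles are assumptions for illustration, and the YAML is assumed to have been parsed into objects already.

```typescript
// Hypothetical shape of one entry in a ruleset's "tests" section (not a Ladder API).
interface RulesetTest {
  url: string;
  test: string; // a Playwright locator expression, e.g. page.getByText("...")
}

// Address of a locally running Ladder instance, taken from the bash script above.
const LADDER_BASE = "http://localhost:8080/";

// Render a Playwright spec file as text: two tests per ruleset entry.
function generateSpec(tests: RulesetTest[]): string {
  const out: string[] = ["import { expect, test } from '@playwright/test';"];
  tests.forEach((t, i) => {
    // Same idea as `awk -F/ '{print $3}'`: the host is the third "/"-separated field.
    const domain = t.url.split("/")[2] ?? "";
    // Number the titles so several entries for one domain do not produce duplicate titles.
    const label = `${domain} #${i + 1}`;
    out.push(
      "",
      `test(${JSON.stringify(label + " has paywall by default")}, async ({ page }) => {`,
      `  await page.goto(${JSON.stringify(t.url)});`,
      `  await expect(${t.test}).toBeVisible();`,
      "});",
      "",
      `test(${JSON.stringify(label + " + Ladder does not have paywall")}, async ({ page }) => {`,
      `  await page.goto(${JSON.stringify(LADDER_BASE + t.url)});`,
      "  await page.waitForLoadState();",
      `  await expect(${t.test}).not.toBeVisible();`,
      "});",
    );
  });
  return out.join("\n") + "\n";
}
```

JSON.stringify is used for the URL and titles so that quotes inside them cannot break the generated file; the locator is still pasted in verbatim, so it has to be valid TypeScript.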

At the moment, the bash script is pretty limited to just checking if a specified element is or is not visible. If we continue this way, we may want the script to be a bit more general so it can capture other scenarios. This may require anyone contributing a rule to be a bit more explicit in their ruleset tests section, so rather than contribute a Playwright locator they may need to provide the expectation with both a locator and an assertion:

    test: expect(page.getByText("This article is exclusive to subscribers.")).toBeVisible()

Some additional parsing in the bash script could insert a .not before the assertion for the ladder test.
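For what it's worth, the ".not" insertion could be sketched like this in TypeScript. The negateAssertion name is made up for illustration, and it assumes the ruleset value always has the form expect(<locator>).<matcher>(...) with no leading await.

```typescript
// Toggle ".not" on an assertion of the form expect(<locator>).<matcher>(...).
// Hypothetical helper for illustration; not part of Playwright or Ladder.
function negateAssertion(assertion: string): string {
  const trimmed = assertion.trim();
  const prefix = "expect(";
  if (!trimmed.startsWith(prefix)) {
    throw new Error("assertion must start with expect(");
  }
  let depth = 0;
  let quote: string | null = null;
  // Scan from the "(" of expect( to find its matching ")",
  // skipping over parentheses that appear inside string literals.
  for (let i = prefix.length - 1; i < trimmed.length; i++) {
    const ch = trimmed.charAt(i);
    if (quote !== null) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === quote) quote = null;
      continue;
    }
    if (ch === '"' || ch === "'" || ch === "`") {
      quote = ch;
    } else if (ch === "(") {
      depth++;
    } else if (ch === ")" && --depth === 0) {
      const head = trimmed.slice(0, i + 1); // expect(<locator>)
      const tail = trimmed.slice(i + 1); // .<matcher>(...)
      // Already negated: drop the ".not"; otherwise add it.
      return tail.startsWith(".not.") ? head + tail.slice(4) : head + ".not" + tail;
    }
  }
  throw new Error("unbalanced parentheses in assertion");
}
```

Doing this with a small paren-matching scan rather than a regex keeps it from tripping over a ")" inside the paywall text itself.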

deoxykev · 2023-11-28T13:59:17Z

Nice work!

I've been thinking about how to generate rules for any site in an automated fashion. One of the main roadblocks is figuring out whether or not a site is paywalled, and then generating a test for it. I wonder if it's as simple as extracting the visible text from a page and asking an LLM whether or not it is paywall text.
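A minimal sketch of that idea, under a few assumptions: the visible text has already been extracted (for example with Playwright's page.locator("body").innerText()), and the LLM call is passed in as a function so any backend could be plugged in. The names AskLlm, buildPaywallPrompt, parseVerdict and looksPaywalled are hypothetical, and the prompt wording is only a guess at what might work.

```typescript
// Any function that sends a prompt to an LLM and resolves with its reply (assumed interface).
type AskLlm = (prompt: string) => Promise<string>;

// Build a yes/no prompt from the page's visible text, truncated to keep it short.
function buildPaywallPrompt(visibleText: string, maxChars = 4000): string {
  const excerpt = visibleText.replace(/\s+/g, " ").trim().slice(0, maxChars);
  return [
    "Below is the visible text of a web page.",
    "Does it show that the article is behind a paywall?",
    "Answer with exactly one word: YES or NO.",
    "",
    excerpt,
  ].join("\n");
}

// Map the model's reply to true / false, or null when the reply is not a clear YES or NO.
function parseVerdict(reply: string): boolean | null {
  const word = reply.trim().toUpperCase();
  if (/^YES\b/.test(word)) return true;
  if (/^NO\b/.test(word)) return false;
  return null;
}

// Ask the injected LLM function whether the text looks paywalled.
async function looksPaywalled(visibleText: string, ask: AskLlm): Promise<boolean | null> {
  return parseVerdict(await ask(buildPaywallPrompt(visibleText)));
}
```

Returning null for an unclear reply leaves room to flag those pages for manual review instead of guessing.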

actuallymentor · 2024-08-31T11:14:39Z

I wonder if it's as simple as extracting visible text from a page, and asking an LLM whether or not it is paywall text is sufficient

@deoxykev I know LLMs are a blunt force object, but you could even use a screenshot instead of text. The headless browsers support this out of the box usually, and visual inspection often is easier than code logic for an LLM.

This could even be integrated in a Docker composition with ollama so the LLM calling is local.
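As a sketch of what the screenshot variant might send to a local Ollama instance: Ollama's /api/generate endpoint accepts base64-encoded images for multimodal models. The buildScreenshotRequest name, the "llava" model choice and the prompt wording are assumptions here, not something tested against Ladder.

```typescript
// Request body for Ollama's POST /api/generate endpoint (only the fields used here).
interface OllamaGenerateRequest {
  model: string;
  prompt: string;
  images: string[]; // base64-encoded images, for multimodal models
  stream: boolean;
}

// Build the payload from a screenshot that is already base64-encoded.
// With Playwright that could be: (await page.screenshot()).toString("base64")
function buildScreenshotRequest(
  screenshotBase64: string,
  model = "llava", // assumed model name; any multimodal model pulled into Ollama
): OllamaGenerateRequest {
  return {
    model,
    prompt:
      "This is a screenshot of a news article page. " +
      "Is the article hidden behind a paywall? Answer YES or NO.",
    images: [screenshotBase64],
    stream: false, // ask for a single JSON reply instead of a stream of chunks
  };
}
```

The resulting object would be sent as JSON to http://localhost:11434/api/generate (Ollama's default port), and the answer read from the "response" field of the reply.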
