Skip to content

Latest commit

 

History

History
886 lines (719 loc) · 29.5 KB

activipy.org

File metadata and controls

886 lines (719 loc) · 29.5 KB

Braindump

Python is an imperative language; we won’t try to be “purely functional” but “hidden state” could really screw us, so we’ll try to reduce that as much as possible.

We need:

  • Use a gdbm backed activitystreams object storage
  • For extensions, pyld
  • verifiers
  • method dispatch

DMD store should store in a dictionary like: … with the key being the “@id” of the toplevel activitystreams object.

{"asobj": {"@type": "Object",
           "@id": "uuid:d773cb99-078b-496b-b3f0-012d3ade5930",
           "blah": "blah"},
 "private": {"blah": "blah"}}

We are going to have to handle inheritance manually, because there can be multiple types. We can’t use python’s inheritance system.

We need an “ASVocab” system to operate within. This one should have a memoized version of the json-ld expansion of the default activitystreams vocabulary, but it should also have a mapping of type URIs to ASClass objects.

The store will be used separately, should provide simple store and retrieve mechanisms.

Some complexity comes from the fact that in a “real world” system, we don’t just store and receive what’s been given to us. We need a way to trigger application-specific hooks.

Tasks

Remove “Billy Scripter” stuff from docs

Amy pointed out that it’s unnecessarily negative at one point

Handle that @id is now an alias for id, @type an alias for type

Consuming

We should be able to consume this:

post_this = core.ASObj({
    "type": "Create",
    "id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "type": "Person",
        "id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:[email protected]",
           "acct:[email protected]",
           "acct:[email protected]"],
    "object": {
        "type": "Note",
        "id": "htp://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}},
  vocab.BasicEnv)

Producing

This is less important. The AS2 core doc says that implementations need to treat “id” as an alias for “@id” and “type” for an alias for “@type”, but does not say these need to be the default representations.

Add optional thunk to ASType constructor: get_default_env

Should return the default vocabulary if none specified.

Handle all @context edges

Brainstorming

Probably what we ought to do is enforce that everything in an ASObj share a common @context setup.

This means, it doesn’t matter what’s on an @context coming into ASObj(), we slice it off and replace it with @context after the deepcopy_in (and all child objects, we don’t need an @context at all).

But we do need to be able to “ingest foreign material”, which means we should provide an Environment.ingest() method (or .ingest_foreign())

As for how to combine multiple contexts, it can be done via an array; see the local context part of the docs.

So that solves how to do things! (Though there may be some minor question as to what to do if the application’s @context is also an array… do we merge them? Will it work as-is?)

Handling when creating new ASObj()

Basically, strip off the existing @context recursively with deepcopy_in, and add our context instead, if any exists.

Environment.ingest(jsonld, imply_asvocab=True)

This should expand out a json-ld document then compact it to the Environment’s own context.

The question is, can we use the “grafted-on @context” required in the above description, or do we have

breaking off a child ASObj

Basically attach an @context to the json we have of it (though I guess this will happen automatically)

Archive

Add extra_context field to Environment

Add to init structure
Add on expansion
Oh wait remove on expansion

That didn’t make sense, because the extra context gets added to the asobj, so it doesn’t need to be implicit.

Maybe rename MethodId to MethodSpec

Switch from method “object symbols” to strings?

Currently we require setting up “object symbols” which self-document what a method does and etc, but they’re also slightly unwieldy to set up. It will be less precise, but maybe easier, to just use strings to represent what a method is.

Add memoization

General memoization function

Property memoization

Switch ASObj.__getitem__ to use deepcopy_jsobj_out

Or rather, we should specify both a deepcopy_jsobj_in and a deepcopy_jsobj_out :)

So, if we’re accessing a key value pair where the value is a list of activitystreams objects, we’d like the activitystreams objects converted to ASObj objects as well.

Add deepcopy_jsobj_out

Use it in ASObj.__getitem__

Rename deepcopy_jsobj -> deepcopy_jsobj_in

Add tests

Add ASProp

Add demos section

Vocab demo

RoyalCheckin or CheckUp

  • checkup:CheckIn
  • checkup:RoyalStatus
  • checkup:Coupon

linter/validator

We can use the method dispatch system to handle this.

Archive

Easy GDBM based storage system

Documentation basics

Tutorial

Document basic “types” structure

Archive

Add sphinx basic structure

Documentation structure

  • Intro
    • About ActiviPy
    • Tutorial
  • Core types
  • Vocabulary
  • Extending the environment
  • Advanced Examples

Rename CheckUp demo to VisitIt everywhere

code

docs

Make ASVocab more useful

How to do this?

We want to:

  • probably preload a json-ld context
  • Somehow make ASVocab objects useful for a
  • make ourself more useful to ASObj objects

Tests

Test all types.py stuff

ASVocab

ASObj

ASEnvironment

Archive

ASType

Basic vocabs stuff

Archive

Basic test infrastructure

Consider rename to Pydraulics?

After all, I’m the one who started that project, and it’s abandoned…

Investigate restructuring ASType instances via metaclassing

Basically, the main reason is that we’d like to be able to do:

help(CollectionPage)

and get the appropriate useful info.

However, it’s still true that calling CollectionPage() should return a ASObj object, not a CollectionPage() object. Reason being that ActivityStreams objects can have multiple “@type” fields.

Archive

Add license stuff

Add license files

Add note on why both apache v2 and gplv3 to COPYING

Add copyright headers and a note on convention

Fill in complete vocabulary

CANCELED Switch to pyrsistent for ASObj structures?

https://github.com/tobgu/pyrsistent

We more or less force/fake immutability right now, and maybe it makes more sense to just use something that is immutable

UPDATE: Canceled. More info on why Pyrsistent has a promising future, but can’t work for now.

CANCELED Command line test suite

This is its own project now. See this issue.

Relevant parts of convo

<evanpro> paroneayea: so, a couple of questions on that <evanpro> Does having a single package that is a producer and a consumer make sense? Or multiple packages? [12:18] <paroneayea> evanpro: my first goal is to make a library for the purpose of tests, basically along the lines of how you suggested… it’ll just store @id’s to a gdbm store. But I’ll design it in a way that afterwards, it can be used for something like pypump, and for using as2 stuff <paroneayea> but my first goal is: fulfill the test requirements <evanpro> Whoa! <paroneayea> while working towards something more general <paroneayea> gdbm is oldschool I know <evanpro> Wait what’s the GDBM for? <evanpro> I don’t understand what you need persistence for [12:19] <paroneayea> well it could also just be a dictionary <evanpro> Wouldn’t an AS2 library do something like <paroneayea> I was going along with your suggestion that you have a command-line submission tool <evanpro> JSON -> native language object <evanpro> and native language object -> JSON <paroneayea> evanpro: yes <paroneayea> evanpro: ok well maybe it can be in-memory only [12:20] <paroneayea> evanpro: my main concern is get the thing working <evanpro> 1s <evanpro> So I was thinking that a test command-line app might look like this <evanpro> https://gist.github.com/evanp/b49c3fc37caa21a323a1 <strugee> hey, would it be useful if I created next week’s meeting page and filled it with the stuff on the agenda that we didn’t get to? <strugee> e.g. we missed branching models <evanpro> strugee: YES! [12:23] <evanpro> Nice <paroneayea> evanpro: that might work nicely <strugee> will do <paroneayea> evanpro: okay, I will probably do something like that [12:24] <evanpro> paroneayea: and then a test driver would work like this <evanpro> https://gist.github.com/evanp/5d80c0aa3f168465d84d <evanpro> So that way you could call “testdriver.py dumpactivitytype.py” [12:25] <evanpro> as well as “testdriver.py dumpactivitytype.rb” <paroneayea> evanpro: ok <paroneayea> evanpro: I see <paroneayea> evanpro: we also want a way to show mutations [12:26] <paroneayea> evanpro: and side effects <paroneayea> eg update verbs should actually update the thing in store <evanpro> That might be too much for a data format to deal with <paroneayea> evanpro: I mean, for the test suite <evanpro> Yes, that’s what I’m saying <paroneayea> we want to be sure that activities can actually do the things they promise <evanpro> What I’m saying is that no we don’t [12:27] <evanpro> When we’re testing the social API, definitely <paroneayea> evanpro: this is why I was saying that there’s not much to do as in terms of a test suite <evanpro> But I think an activity streams library should just parse from JSON and export to JSON <paroneayea> the only thing your example checks really is that it’s valid right? <paroneayea> that it’s json, has the right fields, in the right types <evanpro> It checks that the activitystreams implementation library (the one that the dumpactivitytype.py script imports) can find the type of an activity [12:28] <evanpro> I realize that it appears to be really trivial <evanpro> But you’d need dozens of such test scripts [12:29] <evanpro> dumpactivityactortype.py <evanpro> dumpactivityactorid.py <evanpro> That kind of thing <paroneayea> evanpro: okay, so I’ll definitely support this. <evanpro> Another possibility is using command-line arguments <paroneayea> evanpro: though, one of the things is, the activitystreams vocabulary does describe things with side effects <paroneayea> I might test for that too, but I won’t make it so complex that you can’t do the simple tsts you ahve [12:30] <evanpro> That’s probably a fair point <evanpro> I would really, really strongly recommend that you first publish your intentions for the test format <paroneayea> evanpro: to the list? <evanpro> And that you concentrate on the bare minimum first <evanpro> Yes <paroneayea> evanpro: okay I’ll do that <evanpro> to the list [12:31] <paroneayea> evanpro: I was planning on working on deployment stuff this week, but it seems like this has become really urgent <paroneayea> so I’ll make it priority #1 <evanpro> So, one thing we can do when we have even a rudimentary test suite <evanpro> Is that we can start testing libraries <evanpro> And so we can start writing libraries [12:32] <paroneayea> evanpro: right <evanpro> We could even have a hackathon to implement in a lot of different languages <evanpro> And push implementations to npm, Ruby gems, pypi, etc. <paroneayea> evanpro: anyway, maybe now you can see why I was looking at gdbm; if we do have a command line test thing and we do promise to deliver tests on side effects <paroneayea> we need some way to persist things <paroneayea> but <paroneayea> I agree <paroneayea> there are tests that don’t need that <evanpro> Right, I hear you <paroneayea> focus on the other stuff first. <evanpro> They seem trivial but they are so important [12:33] <evanpro> Probably the big thing is defining what the interface between testdriver script and the tested script is <paroneayea> (and the reason why gdbm is even though it’s oldschool, it’s also dead easy to get working because it’s so “dumb”) <paroneayea> evanpro: right. <evanpro> Oh, yeah, GDBM is fine there <evanpro> I might suggest using command-line args, too [12:34] <paroneayea> evanpro: I get why you had a “don’t engineer this, chris!” reaction though :) <evanpro> maybe something like this <paroneayea> er <paroneayea> overengineer <evanpro> <dumpscript> –activity-part actor –part-property id <filename> <evanpro> <dumpscript> –activity-part=actor –part-property=id <filename> [12:35] <evanpro> Those are crummy names but 🤷 <evanpro> That way implementers don’t have to write 50 different testing shims <paroneayea> evanpro: I hear you <paroneayea> evanpro: well, it may even be easier [12:36] <evanpro> It may also be worthwhile to have a producer test <paroneayea> –extract [“actor”][“@id”] <evanpro> That takes in some parameters and outputs some JSON <evanpro> Sure <evanpro> I’d be a little worried about defining a query language <evanpro> But yeah <paroneayea> evanpro: it’s probably equally complex to define a billion arguments <evanpro> So a producer script might take arguments like this <paroneayea> for the different components [12:37] <evanpro> agreed! <evanpro> <buildscript> –actor-id=urn:test:whatever –actor-name=”Evan Prodromou” –activity-type=”Like” –object-id=urn:test:whatever2 –object-name=”This terrible test” [12:38] <evanpro> But yeah pretty nightmarish <paroneayea> evanpro: so is the idea that this should spit out a success/failure code or <evanpro> Oh, no! <evanpro> It should spit out JSON! <paroneayea> just extract the right part? <paroneayea> okay <paroneayea> evanpro: and it should validate, right? [12:39] <evanpro> dumpscript == take JSON, just spit out some extracted part of it <evanpro> buildscript = take params, spit out JSON <paroneayea> oh I see. <paroneayea> okay that makes much more sense. <paroneayea> echoscript == take json, dump out json <paroneayea> sorry ;) <evanpro> dumpscript and buildscript are provided by the implementer to test the implementation [12:40] <evanpro> and there’s a test driver to run them <evanpro> so “testdriver dumpscript.py buildscript.py” <evanpro> Would run all the tests <evanpro> Or something like that <paroneayea> hm ok.... <paroneayea> evanpro: I don’t understand testdriver [12:41] <paroneayea> what does it do? <evanpro> Something like https://gist.github.com/evanp/5d80c0aa3f168465d84d

CANCELED dumpscript

<evanpro> dumpscript == take JSON, just spit out some extracted part of it

import activitystreams

json = parseCommandLineFileArgument()

activity = Activity.fromJSON(json)

print activity.type

<evanpro> <dumpscript> –activity-part=actor –part-property=id <filename>

<evanpro> <dumpscript> –activity-part=actor –part-property=id <filename> <evanpro> Those are crummy names but 🤷 <evanpro> That way implementers don’t have to write 50 different testing shims <paroneayea> evanpro: I hear you <paroneayea> evanpro: well, it may even be easier [12:36] <evanpro> It may also be worthwhile to have a producer test <paroneayea> –extract [“actor”][“@id”] <evanpro> That takes in some parameters and outputs some JSON <evanpro> Sure <evanpro> I’d be a little worried about defining a query language <evanpro> But yeah <paroneayea> evanpro: it’s probably equally complex to define a billion arguments <evanpro> So a producer script might take arguments like this <paroneayea> for the different components [12:37] <evanpro> agreed! <evanpro> <buildscript> –actor-id=urn:test:whatever –actor-name=”Evan Prodromou” –activity-type=”Like” –object-id=urn:test:whatever2 –object-name=”This terrible test” [12:38] <evanpro> But yeah pretty nightmarish

CANCELED buildscript

<evanpro> buildscript = take params, spit out JSON

CANCELED testdriver

<evanpro> so “testdriver dumpscript.py buildscript.py”

Hook up pyld

Brainstorm

Okay, so what do we want to do here?

  • Vocabularies might provide an “implied context”. That’s the biggest issue, because otherwise it can be inferred unambiguously from expanding the document.
  • Mostly, we might not want to re-read things?

This last one is a good goal but maybe we shouldn’t worry about it immediately.

Here’s the options from the JsonLdProcessor code:

class JsonLdProcessor(object):
    """
    A JSON-LD processor.
    """
    # [...]
    def expand(self, input_, options):
        """
        Performs JSON-LD expansion.

        :param input_: the JSON-LD input to expand.
        :param options: the options to use.
          [base] the base IRI to use.
          [expandContext] a context to expand with.
          [keepFreeFloatingNodes] True to keep free-floating nodes,
            False not to (default: False).
          [documentLoader(url)] the document loader
            (default: _default_document_loader).

        :return: the expanded JSON-LD output.
        """
  • we probably want to be able to set expandContext.
  • the documentLoader could thus possibly come with some context preloaded. But that’s kind of an optimization.

At least we know the two main steps now?

Update: It turns out the first of these is much simpler than we originally were thinking! There’s only one implied context in ActivityStreams, so we can hardcode the expandContext.

Handle the implied context

Should be passed into the environment, but possibly built out of the vocabulary.

cache things in the documentLoader

The documentLoader seems to just be a function accepting a URI, and raising JsonLdError if something goes badly.

{
    'contextUrl': None,
    'documentUrl': url,
    'document': data.decode('utf8')
}

So we could write a factory function that takes a mapping of {url: document}

def make_simple_loader(url_map, load_unknown_urls=True):
    def loader(url):
        # foo
        return loaded_url
    return loader

Provide a side-effect free environment option

Easily build expandContext and documentLoader based on supplied vocabulary?

One way or another we want to reduce the amount of data duplicated from the building of the Environment

Maybe rename types.py to core.py

Fix how ASType.__call__() handles long vs short URIs

ActivityStreams “classes”

Note that normal python classes can’t work here.

ASObj

Finish all those TODO methods
Archive
Construction: Do deep copy of asjson manually

This way we can catch any asobj types

Better inheritance order

We should do this like in the ANSI Common Lisp book, where we remove duplicates, but we remove duplictes but keep the last appearance of a “class”

Archive

Add inheritance / method dispatch system

This is trickier than one may think; we can’t do Python style method resolution because an activity may have multiple types.

Easy ASType->ASObj constructor interface

Something like:

from activipy import vocab

root_beer_note = vocab.Create(
    actor=vocab.Person(
        "http://tsyesika.co.uk",
        displayName="Jessica Tallon"),
    to=["acct:[email protected]"],
    object=vocab.Note(
        "http://tsyesika.co.uk/chat/sup-yo/",
        content="Up for some root beer floats?"))

This should be able to flow pretty naturally out of our types.py interface.

“environment” w/ method dispatch and object sugar

Brainstorm

So here’s how this thing works.

There’s an environment, which has a mapping between tuples of (method_symbol, Vocab) and method_to_call.

#                    method name    description    invocation method
save = Method("save", "Save things", handle_one)
gather_something = Method("gather_something", "Accrues some info", handle_map)

myenv = Enviroment(
   mapping={
       (save, Note): note_save,
       (save, Object): basic_save,
})

handle_one(myobj, save, db)

This way, using the inheritance_chain() method, we can handle various types of method handling:

  • handle_one
  • handle_map
  • handle_fold

However, we have enough metadata here to provide some sugar.

myenv = Environment(
  mapping={bla bla},
  vocab=vocab)

activity = Environment.c.Activity("http://oh/snap")
activity.m.save(db)
# or maybe even just activity.save()

This would have to mean that ASObj gets a method dispatch keyword option on construction, which might be a-ok.

I think this is a pretty good approach.

Add Environment and method dispatch

Add vocabulary + method-class sugar

Archive

Clean up method dispatch plan based on convo w/ steve
save_object = Method("save things", "handle_one")

myenv = Enviroment(
   mapping={
       (save_object, Note): note_save,
       })

handle_one(myobj, "save_object", db)
handle_one(myobj, save_object, db)

# more pythonic optional interface
# a bit leaky though
myenv = MetaEnviroment(
   mapping={
       (save_object, Note): note_save,
       }
    vocab=[BasicVocab]
)

myenv.Person("foo")
Person()

CANCELED Pass environment into methods?

Should methods be able to themselves take advantage of method dispatch? If so, they will need “env” as first argument.

Add ASObj.type_astype()

Brainstorm

Here’s the problem.

Assume we made an activity like this:

ROOT_BEER_NOTE_VOCAB = vocab.Create(
    "http://tsyesika.co.uk/act/foo-id-here/",
    actor=vocab.Person(
        "http://tsyesika.co.uk/",
        displayName="Jessica Tallon"),
    to=["acct:[email protected]",
        "acct:[email protected]"],
    object=vocab.Note(
        "htp://tsyesika.co.uk/chat/sup-yo/",
        content="Up for some root beer floats?"))

Now assume we made one like this:

ROOT_BEER_NOTE_JSOBJ = types.ASObj({
    "@type": "Create",
    "@id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "@type": "Person",
        "@id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:[email protected]",
           "acct:[email protected]"],
    "object": {
        "@type": "Note",
        "@id": "htp://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}})

Now even worse:

ROOT_BEER_NOTE_JSOBJ = types.ASObj({
    # AAAAAAAAAAA
    "@type": "http://www.w3.org/ns/activitystreams#Create", 
    "@id": "http://tsyesika.co.uk/act/foo-id-here/",
    "actor": {
        "@type": "Person",
        "@id": "http://tsyesika.co.uk/",
        "displayName": "Jessica Tallon"},
    "to": ["acct:[email protected]",
           "acct:[email protected]"],
    "object": {
        "@type": "Note",
        "@id": "htp://tsyesika.co.uk/chat/sup-yo/",
        "content": "Up for some root beer floats?"}})

So…

  • we really need to know about the whole set of vocabularies in order to do ASObj.type_astype()
  • Obviously, we also need to for method dispatch also
  • It could be then that we don’t load ASObj.vocab, but ASObj.env
  • Also, in general you can always do env.asobj_astypes(asobj)
  • Thus, we should also provide env.asobj_method(asobj, method_symbol)
  • Which means also, more obviously, and as a precedent, we must provide Environment.asobj_astype_chain(asobj)!

This also means that users should, in general, not use ASObj.type_astype(), unless they’re using the “sugar” edition which comes from supplying an environment.

We might want to also provide an expanded=True argument to some of those methods.

OR, maybe we can do “cheapest available” determination of an ASType.

What are the ways we might go about pulling down an ASType?

  • By short ID… but this requires this short ID be marked “safe” for short expansion
  • By already known URI
  • By json-ld examination (most expensive!)

Do we really want an expand=None? Maybe that’s kind of dumb

From short id

The question is, where do we mark whether its safe to consider the short_id as a safe representation from? Is it in the environment or in the vocab?

The vocab may make sense because we could do a shortids=load_from_vocabs((Vocab1, None), (GMGVocab, “gmg:”))

From known URI

By json-ld examination

Add JF2 context vocabulary

JF2 is the new MicroFormats json representation, but there’s a new verion that has a json-ld context. Add it as a vocabulary!

https://github.com/w3c-social/Social-Syntax-Brainstorming/wiki/jf2