Define RFC 4078 (CRID) spec #1

Closed
8 tasks done
Tracked by #144
hairmare opened this issue Jan 8, 2022 · 7 comments

Comments

@hairmare

hairmare commented Jan 8, 2022

RFC 4078 defines the TV-Anytime Content Reference Identifier (CRID) Uniform Resource Locator (URL) scheme (crid:). CRID URLs are references to current or future scheduled publications of broadcast media content over television and radio distribution platforms and the Internet.

A CRID URL takes the form

crid://<DNS name>/<data>

The aim of this spec is to document our use of rabe.ch for the "DNS name" part. It SHALL also define the "data" part in a normative way and a registry for the parts of "data" that benefit from further specification. The spec will be managed in a fashion similar to RaBe CloudEvents.

This spec SHALL allow us to build a CRID authority as defined in ETSI TS 102 822-4 and according to the XSD published in TS 102 822-3-1.

Given that we don't implement low-level DAB/DVB transports or PDRs on our own, our usage of CRIDs and resolving will mainly focus on bi-directional (TCP/IP based) CRID resolution.

The authority will be discoverable using DNS SRV records: _lres._tcp.rabe.ch. Summarised from ETSI TS 102 822-3-1, the authority does the following:

  • a request to http://<Path_to_server_script>?CRID=value1&CRID=value2 containing multiple CRIDs to look up is sent
  • the authority returns either the results of the lookup, a redirect to the results (for distribution/lb purposes), or a Resolving Authority Record (RAR) in case resolving needs to happen elsewhere.
  • the returned results are text/xml and may be one of the following elements from the XSD: GroupInformationTable, ProgramInformationTable, ProgramLocationTable, ContentReferencingTable
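The request format above can be sketched in a few lines of Python. This is only an illustration of the repeated CRID query parameter. The resolver URL `http://resolver.example.org/lres` and the function name `build_resolution_url` are made up for the example, and a real client would first discover the host via the SRV record.

```python
from urllib.parse import urlencode


def build_resolution_url(script_url: str, crids: list[str]) -> str:
    """Build a lookup URL carrying one CRID parameter per identifier."""
    # A list of pairs keeps the repeated "CRID" key, giving
    # ?CRID=value1&CRID=value2 as described above.
    query = urlencode([("CRID", crid) for crid in crids])
    return f"{script_url}?{query}"


# Hypothetical resolver endpoint, not a real RaBe service.
url = build_resolution_url(
    "http://resolver.example.org/lres",
    ["crid://rabe.ch/v1/info", "crid://rabe.ch/v1/klangbecken"],
)
print(url)
```

Note that `urlencode` percent-encodes the `:` and `/` characters of each CRID so that they survive as query values.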

The definition of data is expected to match the elements our CRID authority supports with the aims of making resolving as seamless as possible.

At the time of writing i have not been able to find an open source implementation of a CRID authority (nor a proprietary one for that matter). Any existing implementations should likely govern the definition of this spec.

Further information is potentially in ETSI TS 102 822-6-2 which defines UDDI and WSDL. While these infos are specific to using SOAP, chances are high that we would prefer not to do so (rather we'd stick with embedding metadata in the authority response described above).

ETSI TS 102 822-8 describes how classic smart tv/dvb CRID/metadata can be interchanged to be accessible from the web. In our web-first case we will expose the underlying data directly w/o catering to any kind of smart tv/radio devices (this would be the responsibility of any low-level service provider from our pov).

Tasks

  • find a sensible way to implement a CRID authority/resolver
  • figure out how RaBe CloudEvents fit into the CRID picture (ie for CRID collecting via event bus)
    • no authority means we do not need this
  • draw sketch of architecture with focus on datasources involved with CRID resolving
    • no authority means we do not need this
  • decide if data part of crid starts with versioning info (ie. crid://rabe.ch/<version>/<data_content> where version = v1, v1alpha, v1beta etc) (yes, see the versioning comment in this issue)
  • compile list of example crids we would expect to see
  • define data_content part of data part of crid (using existing/common URL semantics)
  • is it possible to use originator defined content for parts of the crid data
    • this is in theory possible but doesn't make sense in the larger context of crids
    • without an authority/resolver we'll work with ad-hoc crids which are similar to originator defined ones for our purpose (which is to generate some radiodns but not more for now)
  • write spec


@hairmare

find a sensible way to implement a CRID authority/resolver

When we start using CRID, it is very unlikely that we will be implementing a proper CRID authority as per ETSI TS 102 822-3-1.

For one, no opensource implementations of such an authority seem to be readily available.

The spec also does not align well with us wanting to stop using XML and switching to more modern standards like some form of JSON (see radiorabe/nowplaying#128 for replacing the current XML output, more issues for other legacy xml formats exist as well).

Further searching on github for anything related has turned up tvheadend/tvheadend#315 which indicates that there is some merit in using a CRID based system without providing a "proper" CRID authority.

All in all, the decision not to implement a CRID authority for now means that the first iteration of our crid-spec will have to focus on the concept of originator defined content as its baseline. In essence this means that we will be namespacing the data contents in a way that lets a decentralized authority/algorithm generate PI for CRIDs specific to a show or track.

An alternative worth exploring might be to consider implementing a RESTful alternative to the authority standard based on JSON (and possibly with lookup support). This isn't very attractive given that we don't want to have such code in scope for now (and possibly never at all).

For now i'll try to look into possibilities and risks of going down the originator defined route and circle back to this if i'm either bored enough to write an oss crid authority and/or the originator defined approach doesn't end up being feasible.

@hairmare

decide if data part of crid starts with versioning info (ie. crid://rabe.ch/<version>/<data_content> where version = v1, v1alpha, v1beta etc)

this one is easy, we will be using k8s api version style versions in the data parts of the crid and stick to the examples given:

  • crid://rabe.ch/v1alpha/<data_content>
  • crid://rabe.ch/v1alpha1/<data_content>
  • crid://rabe.ch/v1beta/<data_content>
  • crid://rabe.ch/v1rc/<data_content>
  • crid://rabe.ch/v1/<data_content>

I don't expect us to define all the versions, we'll want to start off with v1alpha and introduce v1 as soon as we have something working that we can rely on.
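A minimal sketch of how such version strings could be validated and ordered. The ordering (alpha before beta before rc before the final release) is an assumption borrowed from common release practice, not something decided in this issue, and `parse_version` is a hypothetical helper name.

```python
import re

# v<major>, optionally followed by alpha/beta/rc and an optional number.
VERSION_RE = re.compile(r"v(?P<major>\d+)(?:(?P<stage>alpha|beta|rc)(?P<number>\d*))?")

# Assumed ordering: pre-releases sort before the final release.
STAGE_ORDER = {"alpha": 0, "beta": 1, "rc": 2, None: 3}


def parse_version(version: str) -> tuple[int, int, int]:
    """Return a sortable (major, stage rank, stage number) tuple."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"not a valid crid version: {version!r}")
    number = match.group("number")
    return (
        int(match.group("major")),
        STAGE_ORDER[match.group("stage")],
        int(number) if number else 0,
    )


versions = ["v1", "v1beta", "v1alpha1", "v1rc", "v1alpha"]
print(sorted(versions, key=parse_version))
```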

@hairmare

hairmare commented Jan 21, 2022

I've been looking at other implementations, specifically PEACH and BBC PIPs, to see if i can find any concrete, radio oriented implementations of CRID, to no avail. I still need to investigate odr-radioepg-bridge and its dependencies to see if i can find additional info.

edit: even finding a reference to crids in python-hybridspi feels like a win at this point. it doesn't help wrt how to originate crids though.

@hairmare

Example CRIDs

These example CRIDs are based on some real world examples. Currently parts of our automation already use human-readable show names to match things. I'm proposing we stick with that and use human-readable strings going forward.

| Show | Description | Web URL | CRID |
| --- | --- | --- | --- |
| Der Morgen (Freitag) | morning show, is a different show for each day of the week, does not have repeats and no individual episodes on the website | https://rabe.ch/der-morgen-freitag/ | crid://rabe.ch/v1/der-morgen-freitag |
| RaBe Info | news, different on each day of the week, gets repeated once on air and published as podcast, has a page per episode on the web site | https://rabe.ch/info/ | crid://rabe.ch/v1/info |
| Klangbecken | Always on show, gets used as a filler if nothing is scheduled and has its own schedule, no repeats, no episodes | https://rabe.ch/klangbecken/ | crid://rabe.ch/v1/klangbecken |
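In all three examples the CRID path matches the slug of the show's web URL. A small sketch of that mapping follows. The function name `crid_for_show` is hypothetical, and the assumption that every show page lives directly under https://rabe.ch/ is taken from the examples only.

```python
from urllib.parse import urlsplit


def crid_for_show(web_url: str, version: str = "v1") -> str:
    """Derive a show CRID from the show's page URL on the website."""
    # "/der-morgen-freitag/" becomes "der-morgen-freitag".
    slug = urlsplit(web_url).path.strip("/")
    if not slug or "/" in slug:
        raise ValueError(f"expected a top-level show page, got {web_url!r}")
    return f"crid://rabe.ch/{version}/{slug}"


print(crid_for_show("https://rabe.ch/der-morgen-freitag/"))
```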

Most of our shows are somewhere between how Info and how Der Morgen (Freitag) work. So in some cases there is a concept of episodes, in other cases there is not (the morning shows are always happening and don't really have eps). This is reflected on the website where Info will have a page per ep as well as their show page while Der Morgen only has a show page. From the website pov shows can switch freely between both of these modes (hence the linked morgen having some old eps from 2018).

From a crid standpoint it makes sense to simplify things. For a start we won't be defining CRIDs for episodes but just use the show's crid instead (with optional time info in a local part using #). Should we decide to encode more information into crids at a later stage we can either define v2 or stick ep info onto the path (ie. /v1/info/2022-01-21).

@hairmare

hairmare commented Jan 21, 2022

abnf:

crid          =   "crid://rabe.ch/" version "/" data-content
version       =   "v" 1*DIGIT [ pre-release ]          ; ie. v1, v2,
pre-release   =   ( "alpha" / "beta" / "rc" ) *DIGIT   ; v1alpha, v1alpha1, v1beta, ...

data-content  =   show-name [ "#" media-frags ]
show-name     =   1*( ALPHA / "-" )   ; show name string derived from website, ie. der-morgen-freitag
media-frags   =   utc-range   ; based on https://www.w3.org/TR/media-frags/

utc-range     =   "t=clock:" utc-date-time "-" [ utc-date-time ]
utc-date-time =   utc-date "T" utc-time "Z"
utc-date      =   8DIGIT                    ; < YYYYMMDD >
utc-time      =   6DIGIT [ "." fraction ]   ; < HHMMSS.fraction >
fraction      =   1*2DIGIT                  ; 0-99
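The grammar is small enough to be expressed as a single regular expression. The following is a sketch, not a reference parser. It assumes show names consist of letters and hyphens (as in crid://rabe.ch/v1/der-morgen-freitag), and it treats the "-" after the start time as optional so that CRIDs naming a single point in time without a trailing dash are accepted too. `ParsedCrid` and `parse_crid` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

_CLOCK = r"\d{8}T\d{6}(?:\.\d{1,2})?Z"  # utc-date "T" utc-time "Z"

CRID_RE = re.compile(
    r"crid://rabe\.ch/"
    r"(?P<version>v\d+(?:(?:alpha|beta|rc)\d*)?)/"
    r"(?P<show>[A-Za-z-]+)"  # assumption: letters and hyphens only
    rf"(?:#t=clock:(?P<start>{_CLOCK})(?:-(?P<end>{_CLOCK})?)?)?"
)


class ParsedCrid(NamedTuple):
    version: str
    show: str
    start: Optional[str]  # raw timestamp strings, None when absent
    end: Optional[str]


def parse_crid(crid: str) -> ParsedCrid:
    """Split a rabe.ch CRID into its version, show and time parts."""
    match = CRID_RE.fullmatch(crid)
    if match is None:
        raise ValueError(f"not a valid rabe.ch crid: {crid!r}")
    return ParsedCrid(**match.groupdict())


print(parse_crid("crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z"))
```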

@hairmare

hairmare commented Feb 8, 2022

The media-frags bit needs some more elaboration. We want to use real-world clock timestamps to ensure that the URLs are specific to a specific date and time on the Gregorian calendar. The current W3C recommendation tells us that temporal clipping is denoted by the name t:

Temporal clipping is specified as Normal Play Time (npt) RFC 2326. It can also be specified as SMPTE timecodes SMPTE or as real-world clock time (clock) RFC 2326 in the advanced version described in the Media Fragments 1.0 URI (advanced) document. Begin and end times are always specified in the same format. The format is specified by name, followed by a colon (:), with npt: being the default. In this version of the media fragments specification there is no extensibility mechanism to add time format specifiers.

We want to use RFC 2326, in the section "10.5 PLAY" (page 34) it has this bit:

For playing back a recording of a live presentation, it may be desirable to use clock units:

     C->S: PLAY rtsp://audio.example.com/meeting.en RTSP/1.0
           CSeq: 835
           Session: 12345678
           Range: clock=19961108T142300Z-19961108T143520Z

In the example a user agent is querying a media server and using a media fragment as part of the range request. This isn't our use-case but we can piggy back on the definition given for using clock as the time format specifier. RFC 2326 specifies this as utc-range in ebnf (page 17, 3.7 Absolute Time):

utc-range    =   "clock" "=" utc-time "-" [ utc-time ]
utc-time     =   utc-date "T" time "Z"
utc-date     =   8DIGIT                    ; < YYYYMMDD >
time         =   6DIGIT [ "." fraction ]   ; < HHMMSS.fraction >

Example for November 8, 1996 at 14h37 and 20 and a quarter seconds UTC:

19961108T143720.25Z

Given this, our CRID for a specific point (like the 8. Nov '96) in Klangbecken's history would look like this (we use an adapted utc-range with : instead of =, since we are using the format in the fragment part):

crid://rabe.ch/v1/klangbecken#t=clock:19961108T143720.25Z

This allows referencing a single temporal point of rabe (we reference the actual "physical" broadcast here, not some http based representation). To point to a time range (say to encode duration info for today's Info) we can add an end time:

crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z

The end time is optional because it's not important for a lot of use-cases (ie. if we use crids as id for RaBe CloudEvents then we really just want them to distinctively reference the moment the event is about for uniqueness sake).
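Generating such CRIDs from timezone-aware datetimes could look roughly like this. The helper names `clock` and `crid_at` are hypothetical, fractional seconds are dropped for brevity, and the single-point form is written without a trailing "-" to match the Klangbecken example above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def clock(moment: datetime) -> str:
    """Format an aware datetime as YYYYMMDDTHHMMSSZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("naive datetimes are ambiguous, pass an aware one")
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def crid_at(show: str, start: datetime, end: Optional[datetime] = None,
            version: str = "v1") -> str:
    """Build a show CRID with a t=clock: fragment, end time optional."""
    fragment = f"t=clock:{clock(start)}"
    if end is not None:
        fragment += f"-{clock(end)}"
    return f"crid://rabe.ch/{version}/{show}#{fragment}"


# 13:00 to 13:30 at UTC+1 is 12:00 to 12:30 UTC.
cet = timezone(timedelta(hours=1))
print(crid_at("info",
              datetime(2022, 2, 9, 13, 0, tzinfo=cet),
              datetime(2022, 2, 9, 13, 30, tzinfo=cet)))
```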

There is also some potential in finding/creating a helper lib to help reason about the media fragments. Such a lib could help converting from world clock to other formats like duration in seconds or language native representations. It could also implement comparison operators to help reason about crids, answering questions like "is this crid nested in another crid?" or "is this show still on air right now?".
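To give an idea of what such a helper lib might offer, here is a rough sketch of the two questions mentioned. None of this exists yet and all function names are hypothetical. It assumes that a CRID without an end time denotes a single point in time (start equals end), and that nesting only makes sense between CRIDs of the same show.

```python
from datetime import datetime, timezone
from typing import Tuple


def parse_clock(value: str) -> datetime:
    """Parse YYYYMMDDTHHMMSS[.fraction]Z into an aware UTC datetime."""
    fmt = "%Y%m%dT%H%M%S.%fZ" if "." in value else "%Y%m%dT%H%M%SZ"
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def clock_range(crid: str) -> Tuple[str, datetime, datetime]:
    """Return (crid without fragment, start, end) for a timed CRID."""
    base, sep, fragment = crid.partition("#t=clock:")
    if not sep:
        raise ValueError(f"crid has no t=clock: fragment: {crid!r}")
    start_text, _, end_text = fragment.partition("-")
    start = parse_clock(start_text)
    # Assumption: a missing end time means a single point in time.
    end = parse_clock(end_text) if end_text else start
    return base, start, end


def is_nested(inner: str, outer: str) -> bool:
    """Is the inner CRID's time span fully inside the outer one's?"""
    inner_base, inner_start, inner_end = clock_range(inner)
    outer_base, outer_start, outer_end = clock_range(outer)
    return (inner_base == outer_base
            and outer_start <= inner_start
            and inner_end <= outer_end)


def is_on_air(crid: str, now: datetime) -> bool:
    """Does the given aware datetime fall within the CRID's time span?"""
    _, start, end = clock_range(crid)
    return start <= now <= end


info = "crid://rabe.ch/v1/info#t=clock:20220209T120000Z-20220209T123000Z"
moment = "crid://rabe.ch/v1/info#t=clock:20220209T121500Z"
print(is_nested(moment, info))
```

Duration in seconds then falls out of the same data as `(end - start).total_seconds()`.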

@hairmare

hairmare commented Dec 1, 2022

Fixed in 38fc72c

@hairmare hairmare closed this as completed Dec 1, 2022
Repository owner moved this from In Progress to Done in songticker Dec 1, 2022