Facet Indexing and Query
Drew Farris edited this page Sep 3, 2020 · 6 revisions
Work in Progress
Table | Purpose | Row | Column Family | Column Qualifier | Value
---|---|---|---|---|---
facets | Holds cardinality for pivot/facet pairs | pivot field value \0 facet field value \0 datatype | pivot field name \0 facet field name | datestamp | Serialized HyperLogLog Plus
facets | Cardinality hash reference | pivot field value \0 facet hash \0 datatype | pivot field name \0 facet field name '.hash' | datestamp | Serialized HyperLogLog Plus
facetHashes | Hashes for fields with a large number of values | field value hash | field value | none | none
facetMetadata | Tracks pivot/facet pairs | pivot field name \0 facet field name | 'pv' | none | none
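
As a rough illustration of the `facets` key layout in the table above, the sketch below joins the parts with a null byte. The function name `facet_key` and its parameters are hypothetical and exist only for this example; this is not DataWave's ingest code.

```python
NUL = "\x00"  # the \0 separator used in the table above


def facet_key(pivot_field, pivot_value, facet_field, facet_value, datatype, datestamp):
    """Return (row, column family, column qualifier) for one facets entry.

    Hypothetical helper: it only mirrors the layout described in the table.
    """
    row = NUL.join([pivot_value, facet_value, datatype])
    column_family = NUL.join([pivot_field, facet_field])
    column_qualifier = datestamp
    return row, column_family, column_qualifier


row, column_family, column_qualifier = facet_key(
    "NETWORK_NAME", "cbs", "GENRES", "comedy", "myjson", "20200707"
)
print(repr(row), repr(column_family), column_qualifier)
```

The value stored under such a key would be the serialized HyperLogLog Plus sketch, which this example leaves out.
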
The ingest configuration for the `myjson` datatype (`myjson-ingest-config.xml`) contains the configuration directive:
<property>
<name>myjson.facet.category.name.network</name>
<value>NETWORK_NAME;GENRES,EMBEDDED_CAST_PERSON_GENDER,RATING_AVERAGE</value>
</property>
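
One plausible reading of this value, inferred only from the example entries on this page, is that the field before the `;` is the pivot field and the comma-separated fields after it are the facet fields. The sketch below splits the value on that assumption; `parse_facet_category` is a hypothetical name, and the real ingest parser may behave differently.

```python
def parse_facet_category(value):
    """Split 'PIVOT;FACET1,FACET2,...' into (pivot field, list of facet fields).

    Assumption: ';' separates the pivot field from the facet fields and ','
    separates the facet fields from one another.
    """
    pivot, _, facets = value.partition(";")
    facet_fields = [name.strip() for name in facets.split(",") if name.strip()]
    return pivot.strip(), facet_fields


pivot_field, facet_fields = parse_facet_category(
    "NETWORK_NAME;GENRES,EMBEDDED_CAST_PERSON_GENDER,RATING_AVERAGE"
)
pivot_facet_pairs = [(pivot_field, facet_field) for facet_field in facet_fields]
print(pivot_facet_pairs)
```

The sample table entries also pair `NETWORK_NAME` with itself; this sketch does not add that pair.
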
The `facets` table entries look like this (values omitted):
bbc one\x00female\x00myjson NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER:20200831 [PRIVATE|(BAR&FOO)]
bbc one\x00male\x00myjson NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER:20200831 [PRIVATE|(BAR&FOO)]
bbc one\x00romance\x00myjson NETWORK_NAME\x00GENRES:20200831 [PRIVATE|(BAR&FOO)]
cbs\x005.7\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200707 [PRIVATE|(BAR&FOO)]
cbs\x005.8\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200707 [PRIVATE|(BAR&FOO)]
cbs\x008.2\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200707 [PRIVATE|(BAR&FOO)]
cbs\x008.2\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200831 [PRIVATE|(BAR&FOO)]
cbs\x008.6\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200831 [PRIVATE|(BAR&FOO)]
cbs\x008.8\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200831 [PRIVATE|(BAR&FOO)]
cbs\x009\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00action\x00myjson NETWORK_NAME\x00GENRES:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00cbs\x00myjson NETWORK_NAME\x00NETWORK_NAME:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00cbs\x00myjson NETWORK_NAME\x00NETWORK_NAME:20200831 [PRIVATE|(BAR&FOO)]
cbs\x00comedy\x00myjson NETWORK_NAME\x00GENRES:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00comedy\x00myjson NETWORK_NAME\x00GENRES:20200831 [PRIVATE|(BAR&FOO)]
cbs\x00crime\x00myjson NETWORK_NAME\x00GENRES:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00drama\x00myjson NETWORK_NAME\x00GENRES:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00family\x00myjson NETWORK_NAME\x00GENRES:20200707 [PRIVATE|(BAR&FOO)]
cbs\x00female\x00myjson NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER:20200831 [PRIVATE|(BAR&FOO)]
cbs\x00male\x00myjson NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER:20200831 [PRIVATE|(BAR&FOO)]
cbs\x00medical\x00myjson NETWORK_NAME\x00GENRES:20200831 [PRIVATE|(BAR&FOO)]
cbs\x00war\x00myjson NETWORK_NAME\x00GENRES:20200831 [PRIVATE|(BAR&FOO)]
fox\x007.2\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200707 [PRIVATE|(BAR&FOO)]
fox\x007.8\x00myjson NETWORK_NAME\x00RATING_AVERAGE:20200831 [PRIVATE|(BAR&FOO)]
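
To make the structure of these entries easier to see, the sketch below splits one printed line back into its parts. It assumes the text format shown above: the null byte appears as the literal escape `\x00`, and the column and visibility contain no spaces, although the row may (as in `bbc one`). The name `parse_facet_line` is hypothetical.

```python
ESC = r"\x00"  # the null byte as it is printed in the listing above


def parse_facet_line(line):
    """Split one printed facets entry into its named components."""
    # Split from the right so that spaces inside the row (e.g. "bbc one") survive.
    row, column, visibility = line.rsplit(" ", 2)
    family, datestamp = column.rsplit(":", 1)
    pivot_value, facet_value, datatype = row.split(ESC)
    pivot_field, facet_field = family.split(ESC)
    return {
        "pivot_field": pivot_field,
        "pivot_value": pivot_value,
        "facet_field": facet_field,
        "facet_value": facet_value,
        "datatype": datatype,
        "datestamp": datestamp,
        "visibility": visibility.strip("[]"),
    }


entry = parse_facet_line(
    r"bbc one\x00female\x00myjson NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER:20200831 [PRIVATE|(BAR&FOO)]"
)
print(entry)
```
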
The `facetMetadata` table records which pivot/facet field pairs have been seen:
NETWORK_NAME\x00EMBEDDED_CAST_PERSON_GENDER pv: []
NETWORK_NAME\x00GENRES pv: []
NETWORK_NAME\x00NETWORK_NAME pv: []
NETWORK_NAME\x00RATING_AVERAGE pv: []
The `facetHashes` table holds the one-to-many mapping between a field value hash and the values for that field:
085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/1116.jpg: []
085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/1117.jpg: []
085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/516.jpg: []
085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/517.jpg: []
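
The one-to-many relationship can be illustrated by grouping the printed rows by their hash. The sketch below only regroups the four sample rows above. It does not compute any hash, since the hash function is not described on this page, and `group_by_hash` is a hypothetical name.

```python
from collections import defaultdict

SAMPLE_ROWS = [
    "085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/1116.jpg: []",
    "085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/1117.jpg: []",
    "085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/516.jpg: []",
    "085a7d11b23a6367b8ad http://static.tvmaze.com/uploads/images/medium_portrait/0/517.jpg: []",
]


def group_by_hash(rows):
    """Map each field value hash (the row) to the field values stored under it."""
    values_by_hash = defaultdict(list)
    for line in rows:
        value_hash, rest = line.split(" ", 1)
        # Drop the trailing ": []" (empty column qualifier and visibility).
        field_value = rest.rsplit(": ", 1)[0]
        values_by_hash[value_hash].append(field_value)
    return dict(values_by_hash)


values_by_hash = group_by_hash(SAMPLE_ROWS)
for value_hash, field_values in values_by_hash.items():
    print(value_hash, len(field_values))
```
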