NGSTACK-843 page indexing implementation from fina + FieldMapper impl… #10

Open
wants to merge 48 commits into
base: master
Changes from 24 commits
Commits
152f1c7
NGSTACK-843 page indexing implementation from fina + FieldMapper impl…
Apr 15, 2024
33a0839
NGSTACK-843 improved configuration builder and added tests for page i…
Apr 19, 2024
214f60e
NGSTACK-843 fix service definition
Apr 22, 2024
65e3818
NGSTACK-843 add router stub for tests
Apr 22, 2024
6e0bcf9
NGSTACK-843 move RouterStub to tests/lib
Apr 22, 2024
ac162c6
NGSTACK-843 fix refactor mistake wrong file loaded
Apr 22, 2024
b8b5f64
NGSTACK-843 add missing parameters for tests
Apr 22, 2024
a6f1d46
NGSTACK-843 fix typo
Apr 22, 2024
ccfacdb
NGSTACK-843 use Symfony HttpClient instead of curl
Apr 22, 2024
5632762
NGSTACK-843 throw RuntimeException if RouterStub methods are used
Apr 22, 2024
497a329
NGSTACK-843 renaming files and seperate document factory and page ind…
Apr 25, 2024
3a5d0ef
NGSTACK-843 use page_indexing.enabled instead of use_page_indexing pa…
Apr 25, 2024
6f3976b
NGSTACK-843 configuration builder modifications
Apr 25, 2024
6f13134
NGSTACK-843 fix field mapper docs
Apr 25, 2024
e7589b8
NGSTACK-843 add renamed compiler passes
Apr 25, 2024
28c2082
NGSTACK-843 command improvements
Apr 25, 2024
e497dda
NGSTACK-843 PageTextExtractor fixes
Apr 25, 2024
fe380a7
NGSTACK-843 remove unused use statements
Apr 25, 2024
4c9a90d
NGSTACK-843 add missing use statement
Apr 25, 2024
f9ce3aa
NGSTACK-843 introducing abstract PageTextIndexer
Apr 25, 2024
6dc18e4
NGSTACK-843 add new line at the end of file
Apr 25, 2024
bca6ae4
NGSTACK-843 configuration modifications
May 9, 2024
8849dbf
NGSTACK-843 fix test configuration
May 9, 2024
890d577
NGSTACK-843 move getSiteConfigForContent method to new service SiteAc…
May 9, 2024
c9cb281
NGSTACK-843 solr FieldMapper fix
May 9, 2024
aa257c2
NGSTACK-843 remove not used service and use statement
May 9, 2024
9e2818f
NGSTACK-843 fix configuration info messages
May 9, 2024
d7b6022
NGSTACK-843 composer.json remove config block
May 9, 2024
8b29ba1
NGSTACK-843
May 9, 2024
4f2964b
NGSTACK-843 remove unnecessary code
May 9, 2024
dd4a7eb
NGSTACK-843 set content-ids as optional
May 9, 2024
453cc61
NGSTACK-843 move command to bundle
May 9, 2024
2db38c7
NGSTACK-843 indexContent method change return type to void
May 9, 2024
f5e28b1
NGSTACK-843 fix command service definition
May 9, 2024
c072b6a
NGSTACK-843 output info about site that is processed
May 9, 2024
cc55f5a
NGSTACK-843 renamed PageTextExtractor to NativePageTextExtractor
May 9, 2024
189be79
NGSTACK-843 moved siteConfig resolver line below cache checks
May 9, 2024
3d15cd0
NGSTACK-843 rename SiteAccessConfigResolver to SiteConfigResolver
May 9, 2024
451b901
NGSTACK-843 remove not used variable from method signature
May 9, 2024
4fd0001
NGSTACK-843 remove unnecessary code
May 9, 2024
e4db8b1
NGSTACK-843
May 10, 2024
db86bac
NGSTACK-843 use symfony style in command
May 10, 2024
f7258c9
NGSTACK-843 remove unused lines
May 10, 2024
e5248c7
NGSTACK-843 php-cs-fixer apply rules from site api and run on edited …
May 10, 2024
94529a4
NGSTACK-843 add documentation for page indexing
Aug 7, 2024
3e8626b
NGSTACK-843 fix
Aug 7, 2024
fb6f609
NGSTACK-843 fix documentation formating
Sep 10, 2024
665b705
NGSTACK-843 remove document factory from the documentation (seperated…
Sep 10, 2024
92 changes: 92 additions & 0 deletions bundle/DependencyInjection/Configuration.php
@@ -4,6 +4,7 @@

namespace Netgen\Bundle\IbexaSearchExtraBundle\DependencyInjection;

use Ibexa\Contracts\Core\Repository\LanguageService;
pspanja marked this conversation as resolved.
use Symfony\Component\Config\Definition\Builder\ArrayNodeDefinition;
use Symfony\Component\Config\Definition\Builder\TreeBuilder;
use Symfony\Component\Config\Definition\ConfigurationInterface;
@@ -25,6 +26,7 @@ public function getConfigTreeBuilder(): TreeBuilder
        $this->addIndexableFieldTypeSection($rootNode);
        $this->addSearchResultExtractorSection($rootNode);
        $this->addAsynchronousIndexingSection($rootNode);
        $this->addPageIndexingSection($rootNode);

        return $treeBuilder;
    }
@@ -73,4 +75,94 @@ private function addAsynchronousIndexingSection(ArrayNodeDefinition $nodeDefinit
                ->end()
            ->end();
    }

    private function addPageIndexingSection(ArrayNodeDefinition $nodeDefinition): void
    {
        $keyValidator = static function ($v) {
            foreach (array_keys($v) as $key) {
                if (!is_string($key)) {
                    return true;
                }
            }
            return false;
        };
        $nodeDefinition
            ->children()
                ->arrayNode('page_indexing')
                    ->addDefaultsIfNotSet()
                    ->info('Page indexing configuration')
                    ->children()
                        ->booleanNode('enabled')
                            ->info('Use layouts page text indexing')
Member
"Use page indexing".

Author
fixed in 9e2818f

                            ->defaultFalse()
                        ->end()
                        ->arrayNode('sites')
                            ->useAttributeAsKey('name')
                            ->normalizeKeys(false)
                            ->validate()
                                ->ifTrue($keyValidator)
                                ->thenInvalid('Site root name must be of string type.')
Member
"Site name must be of string type".

Author
fixed in 9e2818f

                            ->end()
                            ->arrayPrototype()
                                ->children()
                                    ->integerNode('tree_root_location_id')
                                        ->info('Site root id')
Member
"Site root Location ID".

Author
fixed in 9e2818f

                                        ->beforeNormalization()->always(static fn ($v) => is_string($v) ? (int)$v : $v)->end()
                                    ->end()
                                    ->arrayNode('languages_siteaccess_map')
                                        ->info('Language key mapped to page siteaccess')
Member
"Language code mapped to page siteaccess".

Author
fixed in 9e2818f

                                        ->useAttributeAsKey('name')
                                        ->normalizeKeys(false)
                                        ->validate()
                                            ->ifTrue($keyValidator)
                                            ->thenInvalid('Page name must be of string type.')
Member
"Language code must be of string type".

Author
fixed in 9e2818f

                                        ->end()
                                        ->scalarPrototype()
                                            ->validate()
                                                ->ifTrue(static fn ($v) => !is_string($v))
                                                ->thenInvalid('Siteaccess name must be of string type.')
                                            ->end()
                                        ->end()
                                    ->end()
                                    ->arrayNode('fields')
                                        ->info('Config for separating page text by importance of the html tags and classes, used to index content to separate solr fields')
Member
"Mapping of indexed field names to an array of HTML tag selectors".

Author
fixed in 9e2818f

                                        ->validate()
                                            ->ifTrue($keyValidator)
                                            ->thenInvalid('Array key (level of field importance) must be of string type.')
Member (pspanja, May 9, 2024)
"Indexed field name must be of string type".

Author
fixed in 9e2818f

                                        ->end()
                                        ->arrayPrototype()
                                            ->useAttributeAsKey('name')
                                            ->normalizeKeys(false)
                                            ->scalarPrototype()
                                                ->validate()
                                                    ->ifTrue(static fn ($v) => !is_string($v))
                                                    ->thenInvalid('HTML selector must be of string type.')
                                                ->end()
                                            ->end()
                                        ->end()
                                    ->end()
                                    ->arrayNode('allowed_content_types')
                                        ->info('Content types to index')
                                        ->useAttributeAsKey('name')
                                        ->normalizeKeys(false)
                                        ->scalarPrototype()
                                            ->validate()
                                                ->ifTrue(static fn ($v) => !is_string($v))
                                                ->thenInvalid('Content type identifier must be of string type.')
                                            ->end()
                                        ->end()
                                    ->end()
                                    ->scalarNode('host')
                                        ->info('Host to index page from, defined in .env files')
                                        ->validate()
                                            ->ifTrue(static fn ($v) => !is_string($v))
                                            ->thenInvalid('Host must be of string type.')
                                        ->end()
                                    ->end()
                                ->end()
                            ->end()
                        ->end()
                    ->end();
    }

}
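The `$keyValidator` closure above returns `true` (meaning "invalid") as soon as one array key is not a string. A minimal sketch of how it behaves on a configuration of the shape this tree describes follows. The sample values (`main`, `eng-GB`, `level1`, the host URL and so on) are hypothetical and not taken from the PR, and the type hints on the closure are added only for this sketch.

```php
<?php

declare(strict_types=1);

// Same logic as the closure in addPageIndexingSection(), with type hints added.
$keyValidator = static function (array $v): bool {
    foreach (array_keys($v) as $key) {
        if (!is_string($key)) {
            return true; // a non-string key makes the node invalid
        }
    }

    return false;
};

// Hypothetical configuration values, shaped like the page_indexing tree.
$pageIndexing = [
    'enabled' => true,
    'sites' => [
        'main' => [
            'tree_root_location_id' => 2,
            'languages_siteaccess_map' => ['eng-GB' => 'eng'],
            'host' => 'https://example.com',
            'fields' => ['level1' => ['h1', 'h2'], 'level2' => ['p']],
            'allowed_content_types' => ['article'],
        ],
    ],
];

$site = $pageIndexing['sites']['main'];

var_dump($keyValidator($pageIndexing['sites']));            // bool(false): keyed by site name
var_dump($keyValidator($site['languages_siteaccess_map'])); // bool(false): keyed by language code
var_dump($keyValidator($site['fields']));                   // bool(false): keyed by field name

// A plain list has integer keys, so the validator flags it as invalid.
var_dump($keyValidator([['tree_root_location_id' => 2]])); // bool(true)
```

Note that PHP casts numeric string keys such as `'42'` to integers, so a site or field named with digits only would also be rejected by this check.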
71 changes: 69 additions & 2 deletions bundle/DependencyInjection/NetgenIbexaSearchExtraExtension.php
@@ -16,6 +16,13 @@

class NetgenIbexaSearchExtraExtension extends Extension implements PrependExtensionInterface
{
    private static array $defaultConfiguration = [
        'tree_root_location_id' => null,
        'languages_siteaccess_map' => [],
        'host' => null,
        'fields' => [],
        'allowed_content_types' => []
    ];
    public function getAlias(): string
    {
        return 'netgen_ibexa_search_extra';
@@ -82,12 +89,11 @@ private function loadBundleSolrEngine(ContainerBuilder $container): void
    private function processExtensionConfiguration(array $configs, ContainerBuilder $container): void
    {
        $configuration = $this->getConfiguration($configs, $container);

        $configuration = $this->processConfiguration($configuration, $configs);

        $this->processIndexableFieldTypeConfiguration($configuration, $container);
        $this->processSearchResultExtractorConfiguration($configuration, $container);
        $this->processAsynchronousIndexingConfiguration($configuration, $container);
        $this->processPageIndexingConfiguration($configuration, $container);
    }

    private function processSearchResultExtractorConfiguration(array $configuration, ContainerBuilder $container): void
@@ -117,4 +123,65 @@ private function processAsynchronousIndexingConfiguration(array $configuration,
            $configuration['use_asynchronous_indexing'],
        );
    }

    private function processPageIndexingConfiguration(array $configuration, ContainerBuilder $container): void
    {
        $container->setParameter(
            'netgen_ibexa_search_extra.page_indexing.sites',
            $configuration['page_indexing']['sites'] ?? [],
        );

        $container->setParameter(
            'netgen_ibexa_search_extra.page_indexing.enabled',
            $configuration['page_indexing']['enabled'] ?? false,
        );

        if (!$configuration['page_indexing']['enabled']) {
Member
This can be removed.

            return;
        }

        if ($configuration['page_indexing']['sites'] === []) {
Member
Not sure if we really need this, empty array as default value should be OK, or?

            $container->setParameter(
                'netgen_ibexa_search_extra.page_indexing.sites',
                [
                    'default' => self::$defaultConfiguration
                ]
            );
            return;
        }
        foreach ($container->getParameter('netgen_ibexa_search_extra.page_indexing.sites') as $siteName => $config) {
            $this->setPageIndexingSitesParameters($configuration, $container, $siteName);
        }
    }

    private function setPageIndexingSitesParameters(array $configuration, ContainerBuilder $container, string $siteName): void
    {
        /** @var array $pageIndexingSitesConfig */
        $pageIndexingSitesConfig = $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites');

        if (!array_key_exists('tree_root_location_id', $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites')[$siteName])) {
            $pageIndexingSitesConfig[$siteName]['tree_root_location_id'] = null;
        }

        if (!array_key_exists('languages_siteaccess_map', $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites')[$siteName])) {
            $pageIndexingSitesConfig[$siteName]['languages_siteaccess_map'] = [];
        }

        if (!array_key_exists('host', $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites')[$siteName])) {
            $pageIndexingSitesConfig[$siteName]['host'] = null;
        }

        if (!array_key_exists('fields', $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites')[$siteName])) {
            $pageIndexingSitesConfig[$siteName]['fields'] = [];
        }

        if (!array_key_exists('allowed_content_types', $container->getParameter('netgen_ibexa_search_extra.page_indexing.sites')[$siteName])) {
            $pageIndexingSitesConfig[$siteName]['allowed_content_types'] = [];
        }

        $container->setParameter(
            'netgen_ibexa_search_extra.page_indexing.sites',
            $pageIndexingSitesConfig,
        );
    }
}
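The per-key `array_key_exists()` checks in `setPageIndexingSitesParameters()` fill in the same values that `$defaultConfiguration` already holds. As a sketch only, and not what the PR implements, the same result could be reached with PHP's array union operator. The helper name `applySiteDefaults` and the sample site values are hypothetical.

```php
<?php

declare(strict_types=1);

// Defaults copied from NetgenIbexaSearchExtraExtension::$defaultConfiguration.
$defaults = [
    'tree_root_location_id' => null,
    'languages_siteaccess_map' => [],
    'host' => null,
    'fields' => [],
    'allowed_content_types' => [],
];

/**
 * Hypothetical helper: fills in missing keys for every configured site.
 */
function applySiteDefaults(array $sites, array $defaults): array
{
    // The union operator keeps every key already present on the left-hand side
    // (even when its value is null), which matches the array_key_exists() checks.
    return array_map(
        static fn (array $site): array => $site + $defaults,
        $sites,
    );
}

// Hypothetical site configuration with only two keys set.
$sites = applySiteDefaults(
    ['main' => ['tree_root_location_id' => 2, 'host' => 'https://example.com']],
    $defaults,
);

var_dump($sites['main']['tree_root_location_id']); // int(2)
var_dump($sites['main']['fields']);                // empty array, taken from the defaults
```

With a single input array, `array_map()` preserves the string keys, so the site names survive the mapping.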
2 changes: 2 additions & 0 deletions bundle/NetgenIbexaSearchExtraBundle.php
@@ -24,5 +24,7 @@ public function build(ContainerBuilder $container): void
        $container->addCompilerPass(new Compiler\FieldType\RichTextIndexablePass());
        $container->addCompilerPass(new Compiler\SearchResultExtractorPass());
        $container->addCompilerPass(new Compiler\RawFacetBuilderDomainVisitorPass());
        $container->addCompilerPass(new Compiler\PageIndexingPass());
        $container->addCompilerPass(new Compiler\ElasticsearchExtensibleDocumentFactoryPass());
    }
}
12 changes: 10 additions & 2 deletions composer.json
@@ -15,7 +15,9 @@
        "ext-dom": "*",
        "ibexa/core": "^4.6",
        "symfony/messenger": "^5.4",
        "symfony/proxy-manager-bridge": "^5.4"
        "symfony/proxy-manager-bridge": "^5.4",
        "ext-libxml": "*",
        "ext-curl": "*"
    },
    "require-dev": {
        "ibexa/fieldtype-richtext": "^4.5",
@@ -30,7 +32,8 @@
    },
    "suggest": {
        "netgen/ibexa-site-api": "Boost your site-building productivity with Ibexa CMS",
        "ibexa/solr": "Supports advanced capabilities with Ibexa search API"
        "ibexa/solr": "Supports advanced capabilities with Ibexa search API",
        "ibexa/elasticsearch": "Needed for layouts indexer"
pspanja marked this conversation as resolved.
    },
    "autoload": {
        "psr-4": {
@@ -51,5 +54,10 @@
    },
    "scripts": {
        "test": "@php vendor/bin/phpunit --colors=always"
    },
    "config": {
Member
This should be removed, it will be added again after install, but we need to ignore it.

        "allow-plugins": {
            "php-http/discovery": false
        }
    }
}