Scanner
in package
Uses UtilsProvider
This isn't really a "cookie" scanner; it is a scanner which detects external URLs, scripts, and iframes for the current page request. Additionally, it can automatically detect usable content blocker templates which we can recommend to the user.
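As a rough usage sketch of the class (assuming the plugin has already bootstrapped; both methods used here are documented below):

```php
// Minimal sketch: obtain the singleton and only react on requests
// that belong to a scan.
$scanner = Scanner::instance();

if ( $scanner->isActive() ) {
    // The current page request should be scanned.
    error_log( 'Scanning source URL: ' . Scanner::getCurrentSourceUrl() );
}
```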
Table of Contents
Constants
- HEADER_SITEMAP_FILTER = 'X-Sitemap-Crawler-Filter'
- QUERY_ARG_FORCE_SITEMAP = 'rcb-force-sitemap'
- See `findByRobots.txt`: This simulates the current blog being public so the `robots.txt` exposes a sitemap and also activates the sitemap.
- QUERY_ARG_JOB_ID = 'rcb-scan-job'
- QUERY_ARG_SITEMAP_FILTER = 'sitemap-crawler-filter'
- QUERY_ARG_TOKEN = 'rcb-scan'
- REAL_QUEUE_TYPE = 'rcb-scan'
Properties
- $isActiveCache : mixed
- $onChangeDetection : mixed
- $query : mixed
Methods
- addUrlsToQueue() : mixed
- Add URLs to the queue so they get scanned.
- force_blocker_enabled() : mixed
- Force to enable the content blocker even when the content blocker is deactivated.
- getCurrentSourceUrl() : mixed
- Get the current source URL usable for a newly created `ScanEntry`. It gets escaped for database use with the help of `esc_url_raw`.
- getOnChangeDetection() : mixed
- Getter.
- getPluginConstantPrefix() : string
- Get the prefix of this plugin so composer packages can dynamically build other constant values on it.
- getQuery() : mixed
- Getter.
- getRoleWithLeastCapabilities() : string|null
- Get the role with the least capabilities.
- getUserLoginUrls() : mixed
- Get user login URLs like login, register and lost password.
- instance() : mixed
- New instance.
- isActive() : mixed
- Check if the current page request should be scanned.
- outputBlogId() : mixed
- Output a boolean flag indicating whether the currently requested sitemap matches the current blog ID.
- probablyForceSitemaps() : mixed
- Force sitemaps to be enabled for the current request.
- probablyReduceCurrentUserPermissions() : mixed
- Remove some capabilities and roles from the current user for the running page request.
- purgeUnused() : mixed
- Read a group of all known site URLs and delete them if they no longer exist in the passed URLs.
- real_queue_error_description() : mixed
- Get a human-readable description for RCB queue jobs.
- real_queue_job_actions() : mixed
- Get actions for RCB queue jobs.
- real_queue_job_label() : mixed
- Get a human-readable label for RCB queue jobs.
- resolve_blockables() : mixed
- Add all known and non-disabled content blocker templates.
- teardown() : mixed
- All templates and external URLs got caught; let's persist them to the database.
- doActionAddedRemoved() : mixed
- Fire a `do_action` when a result from the scanner got removed or added.
- probablyInvalidateCacheAfterJobExecution() : mixed
- Try to invalidate some caches after every scan process. The cache is invalidated in the following cases:
- __construct() : mixed
- C'tor.
Constants
HEADER_SITEMAP_FILTER
public mixed HEADER_SITEMAP_FILTER = 'X-Sitemap-Crawler-Filter'

QUERY_ARG_FORCE_SITEMAP
See `findByRobots.txt`: This simulates the current blog being public so the `robots.txt` exposes a sitemap and also activates the sitemap.
public mixed QUERY_ARG_FORCE_SITEMAP = 'rcb-force-sitemap'

QUERY_ARG_JOB_ID
public mixed QUERY_ARG_JOB_ID = 'rcb-scan-job'

QUERY_ARG_SITEMAP_FILTER
public mixed QUERY_ARG_SITEMAP_FILTER = 'sitemap-crawler-filter'

QUERY_ARG_TOKEN
public mixed QUERY_ARG_TOKEN = 'rcb-scan'

REAL_QUEUE_TYPE
public mixed REAL_QUEUE_TYPE = 'rcb-scan'
Properties
$isActiveCache
private mixed $isActiveCache = null

$onChangeDetection
private mixed $onChangeDetection

$query
private mixed $query
Methods
addUrlsToQueue()
Add URLs to the queue so they get scanned.
public addUrlsToQueue(array<string|int, string> $urls[, bool $purgeUnused = false ]) : mixed
Parameters
- $urls : array<string|int, string>
- $purgeUnused : bool = false
  If true, previously scanned URLs that no longer exist in the passed URLs get automatically purged (pass only if you have the complete sitemap!).
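A short usage sketch (the sitemap URLs are placeholders):

```php
// Queue all URLs of the complete sitemap for scanning and purge previously
// scanned URLs that are no longer part of the sitemap.
Scanner::instance()->addUrlsToQueue(
    [
        'https://example.com/',
        'https://example.com/imprint/',
        'https://example.com/privacy-policy/',
    ],
    true // only pass `true` when the list represents the complete sitemap
);
```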
force_blocker_enabled()
Force to enable the content blocker even when the content blocker is deactivated.
public force_blocker_enabled(bool $enabled) : mixed
Parameters
- $enabled : bool
getCurrentSourceUrl()
Get the current source URL usable for a newly created `ScanEntry`. It gets escaped for database use with the help of `esc_url_raw`.
public static getCurrentSourceUrl() : mixed
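For illustration, deriving an escaped source URL from the current request could look roughly like this (a sketch, not necessarily the actual implementation):

```php
// Sketch: build the URL of the current request and escape it for database use.
$sourceUrl = esc_url_raw( home_url( add_query_arg( [] ) ) );
```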
getOnChangeDetection()
Getter.
public getOnChangeDetection() : mixed
getPluginConstantPrefix()
Get the prefix of this plugin so composer packages can dynamically build other constant values on it.
public getPluginConstantPrefix() : string
Return values
string
getQuery()
Getter.
public getQuery() : mixed
getRoleWithLeastCapabilities()
Get the role with the least capabilities.
public static getRoleWithLeastCapabilities() : string|null
Return values
string|null
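As an illustration of what such a lookup could look like (a sketch, not the plugin's actual code):

```php
// Sketch: pick the registered role slug with the fewest granted capabilities.
$leastRole  = null;
$leastCount = PHP_INT_MAX;
foreach ( wp_roles()->roles as $slug => $role ) {
    $count = count( array_filter( $role['capabilities'] ) );
    if ( $count < $leastCount ) {
        $leastCount = $count;
        $leastRole  = $slug;
    }
}
// On a default installation $leastRole would typically be 'subscriber';
// it stays null when no roles are registered.
```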
getUserLoginUrls()
Get user login URLs like login, register and lost password.
public getUserLoginUrls() : mixed
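The standard WordPress helpers illustrate which kind of URLs are meant (a sketch of the idea, not the exact return value):

```php
// Sketch: the typical user login related URLs of a WordPress site.
$loginUrls = [
    wp_login_url(),        // e.g. https://example.com/wp-login.php
    wp_registration_url(), // registration form
    wp_lostpassword_url(), // lost password form
];
```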
instance()
New instance.
public static instance() : mixed
isActive()
Check if the current page request should be scanned.
public isActive() : mixed
outputBlogId()
Output a boolean flag indicating whether the currently requested sitemap matches the current blog ID.
public outputBlogId() : mixed
This is currently only enabled for logged-in users and is useful in a multisite scenario
with path-based subsites. Example robots.txt:
User-agent: *
Allow: /
Sitemap: https://example.com/de/wp-sitemap.xml
Sitemap: https://example.com/en/wp-sitemap.xml
When we start the scan process on the /de subsite, we are not allowed to access the
https://example.com/en/wp-sitemap.xml URL. This header helps us to identify the correct
blog ID in this case.
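How such a flag could be emitted is sketched below; the `send_headers` hook and the assumption that the query argument carries a blog ID are illustrative, not documented behavior:

```php
// Sketch: answer a sitemap request from the crawler with a header that tells it
// whether the requested sitemap belongs to the current blog.
add_action( 'send_headers', function () {
    if ( isset( $_GET[ Scanner::QUERY_ARG_SITEMAP_FILTER ] ) ) {
        $matches = (int) $_GET[ Scanner::QUERY_ARG_SITEMAP_FILTER ] === get_current_blog_id();
        header( Scanner::HEADER_SITEMAP_FILTER . ': ' . ( $matches ? '1' : '0' ) );
    }
} );
```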
probablyForceSitemaps()
Force sitemaps to be enabled for the current request (see `QUERY_ARG_FORCE_SITEMAP`).
public probablyForceSitemaps() : mixed
probablyReduceCurrentUserPermissions()
Remove some capabilities and roles from the current user for the running page request.
public probablyReduceCurrentUserPermissions() : mixed
For example, some Google Analytics plugins only print out the analytics code when the user
does not have `manage_options` (e.g. WooCommerce Google Analytics).
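One common way to drop a capability only for the running request is the `user_has_cap` filter; this is a sketch of the general mechanism, not necessarily what the plugin does internally:

```php
// Sketch: pretend the current user does not have `manage_options` for this
// request only, so plugins render their frontend snippets as for visitors.
add_filter( 'user_has_cap', function ( $allcaps ) {
    unset( $allcaps['manage_options'] );
    return $allcaps;
} );
```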
purgeUnused()
Read a group of all known site URLs and delete them if they no longer exist in the passed URLs.
public purgeUnused(array<string|int, string> $urls) : mixed
Parameters
- $urls : array<string|int, string>
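Conceptually this is a set difference between the previously known URLs and the passed URLs; a sketch (the storage used here is made up for illustration):

```php
// Sketch: delete every previously known URL that is no longer part of $urls.
$known  = get_option( 'rcb_known_scan_urls', [] ); // illustrative option name, not the real storage
$unused = array_diff( $known, $urls );
foreach ( $unused as $url ) {
    // ... remove the scan results persisted for $url ...
}
```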
real_queue_error_description()
Get a human-readable description for RCB queue jobs.
public real_queue_error_description(string $description, string $type, array<string|int, int> $remaining) : mixed
Parameters
- $description : string
- $type : string
- $remaining : array<string|int, int>
real_queue_job_actions()
Get actions for RCB queue jobs.
public real_queue_job_actions(array<string|int, array<string|int, mixed>> $actions, string $type) : mixed
Parameters
- $actions : array<string|int, array<string|int, mixed>>
- $type : string
real_queue_job_label()
Get a human-readable label for RCB queue jobs.
public real_queue_job_label(string $label, string $originalType) : mixed
Parameters
- $label : string
- $originalType : string
resolve_blockables()
Add all known and non-disabled content blocker templates.
public resolve_blockables(array<string|int, AbstractBlockable> $blockables, HeadlessContentBlocker $headlessContentBlocker) : mixed
Parameters
- $blockables : array<string|int, AbstractBlockable>
- $headlessContentBlocker : HeadlessContentBlocker
teardown()
All templates and external URLs got caught; let's persist them to the database.
public teardown() : mixed
doActionAddedRemoved()
Fire a `do_action` when a result from the scanner got removed or added.
protected doActionAddedRemoved(array<string|int, ScanEntry> $results, array<string|int, string> $beforeTemplates, array<string|int, string> $beforeExternalHosts, array<string|int, string> $afterTemplates, array<string|int, string> $afterExternalHosts) : mixed
Parameters
- $results : array<string|int, ScanEntry>
- $beforeTemplates : array<string|int, string>
- $beforeExternalHosts : array<string|int, string>
- $afterTemplates : array<string|int, string>
- $afterExternalHosts : array<string|int, string>
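The before/after parameters suggest a simple set difference to detect changes; the hook name below is made up for illustration:

```php
// Sketch: derive added/removed templates from the before/after snapshots and
// notify listeners. 'rcb_scanner_result_changed' is an assumed, not documented, hook name.
$addedTemplates   = array_diff( $afterTemplates, $beforeTemplates );
$removedTemplates = array_diff( $beforeTemplates, $afterTemplates );
if ( count( $addedTemplates ) > 0 || count( $removedTemplates ) > 0 ) {
    do_action( 'rcb_scanner_result_changed', $addedTemplates, $removedTemplates, $results );
}
```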
probablyInvalidateCacheAfterJobExecution()
Try to invalidate some caches after every scan process. The cache is invalidated in the following cases:
protected probablyInvalidateCacheAfterJobExecution(Job $job) : mixed
- Scan process was done without a job (e.g. calling the website with ?rcb-scan)
- Scan process was done for a single URL (e.g. after saving a post)
- Scan process is part of a complete website scan; in this case, depending on the count of scanned entries, every x-th job within the complete website scan invalidates the cache.
Parameters
- $job : Job
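A rough sketch of the "every x-th job" idea described above; the variables and numbers are illustrative assumptions:

```php
// Sketch: within a complete website scan, invalidate the cache only on every
// x-th executed job instead of after every single one.
// $executedJobs and $totalJobsInScan are illustrative, not part of the documented API.
$invalidateEvery = max( 1, (int) floor( $totalJobsInScan / 10 ) ); // ~10 invalidations per full scan
if ( $executedJobs % $invalidateEvery === 0 ) {
    // ... clear the page cache here ...
}
```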
__construct()
C'tor.
private __construct() : mixed