BotDetector
- Loads an optional is_bot.php (require_once) into a static slot (only once).
- Optionally (when $extends === true) attempts to load CrawlerDetect via Composer and keeps an instance in a static property (only once).
- Primary detection order:
  - is_bot() (function from is_bot.php): if it returns a truthy value, treat the request as a bot
  - if $extends is enabled and CrawlerDetect is available, use CrawlerDetect
  - otherwise, fall back to simple UA heuristics
Comments in English inside code.
Table of Contents
Properties
- $crawlerDetectInstance : object|null
- $crawlerDetectTried : bool
- $extends : bool
- $isBotCallable : callable|null
- $isBotFilePath : string|null
- $isBotFileTried : bool
- $logger : T4LOG
- $userAgent : string
Methods
- __construct() : mixed
- Constructor.
- debugState() : array<string, mixed>
- Return a debug array showing current internal state (safe for logging).
- detectBotName() : string|null
- Try to detect the bot's canonical name (if any).
- getIsBotFilePath() : string|null
- Return the path of the loaded is_bot file, or null.
- getUserAgent() : string
- Get the user agent string for the current request
- isBot() : bool
- Primary method: is this UA a bot?
- isCrawlerDetectAvailable() : bool
- Returns whether CrawlerDetect instance is available.
- isHuman() : bool
- Convenience alias: not bot => human
- isIsBotFileLoaded() : bool
- Returns true if is_bot() callable is available (loaded).
- preferredDetector() : string
- Return which detector would be used first (string)
- refreshLoads() : void
- Force attempt to (re)load CrawlerDetect and/or is_bot file.
- setUserAgent() : void
- Change user agent for this detector and optionally refresh loaders.
- toArray() : array<string, mixed>
- Return a debug array showing current internal state (safe for logging).
- heuristicIsBot() : bool
- Heuristic check for bots: search for common bot tokens.
- loadCrawlerDetectIfNeeded() : bool
- Attempt to autoload Composer (searches typical vendor/autoload.php locations) and instantiate CrawlerDetect.
- loadIsBotFile() : bool
- Safely require_once an is_bot.php file and detect a callable function.
Properties
$crawlerDetectInstance
private
static object|null
$crawlerDetectInstance
= null
Static instance of CrawlerDetect if available (actual type depends on package)
$crawlerDetectTried
private
static bool
$crawlerDetectTried
= false
Whether CrawlerDetect load was attempted
$extends
private
bool
$extends
Whether this instance will attempt to use CrawlerDetect (if available)
$isBotCallable
private
static callable|null
$isBotCallable
= null
Callable wrapper to is_bot function if present (accepts UA or none)
$isBotFilePath
private
static string|null
$isBotFilePath
= null
Path of loaded is_bot.php (if loaded)
$isBotFileTried
private
static bool
$isBotFileTried
= false
Whether is_bot file load was attempted
$logger
private
static T4LOG
$logger
Logger
$userAgent
private
string
$userAgent
Current user agent string used by instance
Methods
__construct()
Constructor.
public
__construct(T4LOG $logger[, string|null $userAgent = null ][, bool $extends = false ][, string|null $isBotFile = "Lib/is_bot.php" ]) : mixed
Parameters
- $logger : T4LOG
- $userAgent : string|null = null
  User-Agent string; if null, uses $_SERVER['HTTP_USER_AGENT'] or an empty string.
- $extends : bool = false
  If true, attempt to load and use CrawlerDetect (Composer) when required.
- $isBotFile : string|null = "Lib/is_bot.php"
  Path to is_bot.php (optional). If null, no file is auto-loaded here; it can still be loaded later, for example through refreshLoads() (loadIsBotFile() itself is private).
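The $userAgent fallback described above can be sketched as follows. This is a minimal illustration, not the class's actual code: resolveUserAgent() is a hypothetical helper, and passing the server array as a parameter is an assumption made to keep the example testable.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of BotDetector): resolve the effective
 * User-Agent in the documented order: explicit value, then the request
 * header, then an empty string.
 */
function resolveUserAgent(?string $userAgent, array $server): string
{
    if ($userAgent !== null) {
        return $userAgent;
    }
    $fromHeader = $server['HTTP_USER_AGENT'] ?? '';
    // Guard against non-string values in a hand-built server array.
    return is_string($fromHeader) ? $fromHeader : '';
}

// Typical call: fall back to the current request's header.
$ua = resolveUserAgent(null, $_SERVER);
```

In CLI scripts the header is usually absent, so the result there is an empty string.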
debugState()
Return a debug array showing current internal state (safe for logging).
public
debugState() : array<string, mixed>
Be careful not to log sensitive UA in production logs unless necessary.
Return values
array<string, mixed>
detectBotName()
Try to detect the bot's canonical name (if any).
public
detectBotName() : string|null
- First tries CrawlerDetect (if available) for a match or name (best-effort).
- Then attempts to call is_bot callable which might also identify names (not standardized).
- Finally uses UA pattern matching for common bots.
Return values
string|null —Bot name if detected, null otherwise
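The last step (UA pattern matching for common bots) might look roughly like the sketch below. The function name guessBotName() and the pattern map are illustrative assumptions; the patterns the class really uses are not documented here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: map a User-Agent to a canonical bot name.
 * Returns null when no pattern matches.
 */
function guessBotName(string $ua): ?string
{
    // Illustrative map only: canonical name => case-insensitive pattern.
    $patterns = [
        'Googlebot'   => '/googlebot/i',
        'Bingbot'     => '/bingbot/i',
        'DuckDuckBot' => '/duckduckbot/i',
    ];

    foreach ($patterns as $name => $pattern) {
        if (preg_match($pattern, $ua) === 1) {
            return $name;
        }
    }

    return null;
}
```

A map keyed by canonical name keeps the returned string stable even when the UA spells the token in a different case.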
getIsBotFilePath()
Return the path of the loaded is_bot file, or null.
public
static getIsBotFilePath() : string|null
Return values
string|null
getUserAgent()

Get the user agent string for the current request
public
getUserAgent() : string
Return values
string —The user agent string from the HTTP headers
isBot()
Primary method: is this UA a bot?
public
isBot() : bool
Logic:
- If the is_bot callable is available, call it. If it returns a truthy value, the result is true.
- If it returns a falsy value, and $extends === true and CrawlerDetect is available, use CrawlerDetect->isCrawler(UA).
- If neither is available, fall back to a light heuristic (checks for common bot keywords).
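The cascade above can be sketched as a standalone function. decideIsBot() and its callable parameters are hypothetical stand-ins for the class's internals. The sketch also assumes the heuristic runs whenever CrawlerDetect gives no verdict, which the description does not state explicitly.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the detection cascade.
 *
 * @param callable|null $isBot     stand-in for the is_bot() callable
 * @param callable|null $crawler   stand-in for CrawlerDetect->isCrawler()
 * @param callable      $heuristic stand-in for heuristicIsBot()
 */
function decideIsBot(
    string $ua,
    ?callable $isBot,
    bool $extends,
    ?callable $crawler,
    callable $heuristic
): bool {
    // 1. A truthy is_bot() result is final.
    if ($isBot !== null && $isBot($ua)) {
        return true;
    }
    // 2. CrawlerDetect is consulted only when enabled and available.
    if ($extends && $crawler !== null) {
        return (bool) $crawler($ua);
    }
    // 3. Assumption: otherwise the keyword heuristic decides.
    return (bool) $heuristic($ua);
}
```

Passing the detectors as callables keeps each step replaceable in tests without loading is_bot.php or Composer.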
Return values
bool
isCrawlerDetectAvailable()
Returns whether CrawlerDetect instance is available.
public
static isCrawlerDetectAvailable() : bool
Return values
bool
isHuman()
Convenience alias: not bot => human
public
isHuman() : bool
Return values
bool
isIsBotFileLoaded()
Returns true if is_bot() callable is available (loaded).
public
static isIsBotFileLoaded() : bool
Return values
bool
preferredDetector()
Return which detector would be used first (string)
public
preferredDetector() : string
Return values
string
refreshLoads()
Force attempt to (re)load CrawlerDetect and/or is_bot file.
public
refreshLoads([string|null $isBotFilePath = null ][, bool $forceReload = false ]) : void
Useful for testing or if autoloader was registered later.
Parameters
- $isBotFilePath : string|null = null
  Optional path to is_bot.php to (re)load.
- $forceReload : bool = false
  Whether to force a reload even if loading was already attempted.
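The "attempt once, retry only when forced" behaviour implied by the static tried flags and $forceReload can be sketched like this. OnceLoader, its $attempts counter, and the factory callable are all hypothetical and exist only for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: a static loader that runs its factory once and
 * repeats the attempt only when explicitly forced.
 */
final class OnceLoader
{
    private static bool $tried = false;
    private static ?object $instance = null;

    /** Number of real load attempts, exposed for demonstration only. */
    public static int $attempts = 0;

    public static function load(callable $factory, bool $forceReload = false): ?object
    {
        if (self::$tried && !$forceReload) {
            // Already attempted: reuse the cached result, even if it is null.
            return self::$instance;
        }

        self::$tried = true;
        self::$attempts++;

        try {
            self::$instance = $factory();
        } catch (\Throwable $e) {
            // A failed load is remembered as "not available".
            self::$instance = null;
        }

        return self::$instance;
    }
}
```

Caching a failed attempt avoids repeated filesystem or autoloader lookups on every request, and the force flag covers the case where an autoloader is registered later.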
setUserAgent()
Change user agent for this detector and optionally refresh loaders.
public
setUserAgent(string $ua[, bool $refreshLoaders = false ]) : void
Parameters
- $ua : string
- $refreshLoaders : bool = false
toArray()
Return a debug array showing current internal state (safe for logging).
public
toArray() : array<string, mixed>
Be careful not to log sensitive UA in production logs unless necessary.
Return values
array<string, mixed>
heuristicIsBot()
Heuristic check for bots: search for common bot tokens.
private
heuristicIsBot(string $ua) : bool
Parameters
- $ua : string
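A token search of this kind could be sketched as below. The function name looksLikeBot() and the token list are assumptions chosen for illustration; the tokens the class actually checks are not listed in this reference.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a keyword heuristic: true when the User-Agent
 * contains any of a few common bot tokens (case-insensitive).
 */
function looksLikeBot(string $ua): bool
{
    // Illustrative token list, not the class's real one.
    $tokens = ['bot', 'crawl', 'spider', 'slurp'];

    foreach ($tokens as $token) {
        if (stripos($ua, $token) !== false) {
            return true;
        }
    }

    return false;
}
```

Substring checks like this are cheap but coarse: they miss bots that send a browser-like UA and can flag legitimate clients whose UA happens to contain a token.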
Return values
bool
loadCrawlerDetectIfNeeded()
Attempt to autoload Composer (searches typical vendor/autoload.php locations) and instantiate CrawlerDetect.
private
static loadCrawlerDetectIfNeeded() : bool
Returns true if instance is available.
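A rough sketch of this kind of loader follows. It assumes the package in question is jaybizzle/crawler-detect, whose class is Jaybizzle\CrawlerDetect\CrawlerDetect; the function name tryLoadCrawlerDetect() and the candidate paths are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: make sure Composer's autoloader is loaded, then
 * instantiate CrawlerDetect if the class can be found.
 *
 * @param string[] $autoloadCandidates possible vendor/autoload.php paths
 */
function tryLoadCrawlerDetect(array $autoloadCandidates): ?object
{
    // Assumption: the jaybizzle/crawler-detect package provides the class.
    $class = 'Jaybizzle\\CrawlerDetect\\CrawlerDetect';

    if (!class_exists($class)) {
        foreach ($autoloadCandidates as $file) {
            if (is_file($file)) {
                require_once $file;
                break;
            }
        }
    }

    return class_exists($class) ? new $class() : null;
}

// Example candidate locations (assumed, adjust to the project layout).
$detector = tryLoadCrawlerDetect([
    __DIR__ . '/vendor/autoload.php',
    dirname(__DIR__) . '/vendor/autoload.php',
]);
```

Checking class_exists() first means nothing is required when an autoloader has already been registered elsewhere in the application.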
Return values
bool
loadIsBotFile()
Safely require_once an is_bot.php file and detect a callable function.
private
static loadIsBotFile(string $path[, string|null $functionName = 'is_bot' ]) : bool
Returns true if a usable is_bot function was found after loading.
The expected function name is "is_bot" by default. If the file defines a function with a different name, pass that name via $functionName.
Parameters
- $path : string
- $functionName : string|null = 'is_bot'
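The load-and-detect step described for this method can be sketched as follows. loadBotFunction() is a hypothetical free function, not the private method itself, and returning the callable instead of a bool is a choice made for the example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: include a PHP file once and return the named
 * function as a callable, or null if the file or function is missing.
 */
function loadBotFunction(string $path, string $functionName = 'is_bot'): ?callable
{
    if (!is_file($path) || !is_readable($path)) {
        return null;
    }

    try {
        require_once $path;
    } catch (\Throwable $e) {
        // Parse errors and exceptions thrown by the file end up here.
        return null;
    }

    // A function name string is a valid callable once the function exists.
    return function_exists($functionName) ? $functionName : null;
}
```

Checking the path before require_once avoids the fatal error PHP raises for a missing required file.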