FileSafetyScanner
Chunked file scanner that reads with fopen/fread, so large files are never loaded into memory in full.
Table of Contents
Properties
- $caseInsensitive : bool
- $chunkSize : int
- $excludeExtensions : array<string|int, string>
- $excludePaths : array<string|int, string>
- $findingCallback : callable|null
- $findings : array<int, array<string|int, mixed>>
- $includeExtensions : array<string|int, string>
- $logger : LoggerInterface|null
- $maxBytesPerFile : int|null
- $maxSignatureLength : int
- $signatures : array<int, string|array<string|int, mixed>>
- Signatures to search for.
Methods
- __construct() : mixed
- clearFindings() : void
- Clear findings.
- getFindings() : array<int, array<string|int, mixed>>
- Get accumulated findings (array).
- isFileSafe() : void
- Scan a single file using fopen/fread and chunked scanning.
- isSafe() : void
- Scan given path. If path is a directory, scan recursively.
- setFindingCallback() : void
- Set callback to be invoked for each finding.
- computeMaxSignatureLength() : void
- Compute maximum signature length (used to determine overlap between chunks).
- defaultSignatures() : array<int, string>
- Default signatures to look for.
- isPathExcluded() : bool
- Decide whether a path should be excluded by excludePaths.
- normalizeExtensions() : array<string|int, string>
- Normalize extensions to lowercase, no dot.
- recordFindingFromBuffer() : void
- Record a finding based on a match inside the concatenated buffer.
- searchBufferForSignatures() : void
- Search given buffer (which is prevTail + chunk) for configured signatures.
- shouldScanByExtension() : bool
- Decide whether to scan file by extension rules.
Properties
$caseInsensitive
private
bool
$caseInsensitive
Whether literal string searches are case-insensitive (uses stripos).
$chunkSize
private
int
$chunkSize
Chunk size for each read; the default is 8 MB. It can be increased (for example to 50 MB), but memory use grows with it.
$excludeExtensions
private
array<string|int, string>
$excludeExtensions
Blacklist of extensions (no dot) to skip. Ignored when includeExtensions is not empty.
$excludePaths
private
array<string|int, string>
$excludePaths
Path substrings or regex patterns identifying directories and files to exclude.
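A minimal sketch of how substring and regex exclusion rules could be told apart. The function name `pathIsExcluded` is hypothetical, and the convention that a rule wrapped in `#` delimiters is a PCRE pattern is an assumption made for this example; the class may distinguish the two differently.

```php
<?php
/**
 * Hypothetical helper: true when $path matches any exclusion rule.
 * Assumption: a rule of the form #...# (optionally followed by flags)
 * is a PCRE pattern; every other rule is a plain substring.
 */
function pathIsExcluded(string $path, array $excludePaths): bool
{
    // Normalize Windows separators so rules can be written with '/'.
    $normalized = str_replace('\\', '/', $path);

    foreach ($excludePaths as $rule) {
        if ($rule === '') {
            continue; // an empty substring would match every path
        }
        if (preg_match('/^#.+#[a-zA-Z]*$/s', $rule) === 1) {
            // Regex rule; @ suppresses warnings from malformed patterns.
            if (@preg_match($rule, $normalized) === 1) {
                return true;
            }
            continue;
        }
        if (strpos($normalized, $rule) !== false) {
            return true;
        }
    }
    return false;
}
```

With this convention, `['/vendor/', '#\.tmp$#i']` would skip anything under a `vendor` directory and any file ending in `.tmp`.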
$findingCallback
private
callable|null
$findingCallback
= null
Callback invoked when a finding is detected: function(array $finding): void
$findings
private
array<int, array<string|int, mixed>>
$findings
= []
Accumulated findings
$includeExtensions
private
array<string|int, string>
$includeExtensions
Whitelist of extensions (no dot). If non-empty, only these extensions are scanned and the whitelist takes priority over excludeExtensions.
$logger
private
LoggerInterface|null
$logger
Logger to use (null = no logging)
$maxBytesPerFile
private
int|null
$maxBytesPerFile
Maximum bytes to scan per file (null = unlimited)
$maxSignatureLength
private
int
$maxSignatureLength
= 0
Length in bytes of the longest signature, used as the overlap between consecutive chunks.
$signatures
Signatures to search for.
private
array<int, string|array<string|int, mixed>>
$signatures
Each item can be:
- string (literal substring search)
- array of form ['regex' => '...'] to use preg_match (PCRE)
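The two signature forms above could be matched against a buffer roughly as follows. This is a sketch, not the class's own code: `findSignature` is a hypothetical name, and the union type in its signature assumes PHP 8.0 or later.

```php
<?php
/**
 * Hypothetical helper: find the first match of one signature in $buffer.
 * Returns [byteOffset, label] or null when nothing matches.
 */
function findSignature(string $buffer, string|array $signature, bool $caseInsensitive): ?array
{
    if (is_string($signature)) {
        if ($signature === '') {
            return null; // an empty needle would match at offset 0
        }
        $pos = $caseInsensitive
            ? stripos($buffer, $signature)
            : strpos($buffer, $signature);
        return $pos === false ? null : [$pos, $signature];
    }

    if (isset($signature['regex'])) {
        // PREG_OFFSET_CAPTURE makes preg_match report the match's byte offset.
        if (preg_match($signature['regex'], $buffer, $m, PREG_OFFSET_CAPTURE) === 1) {
            return [$m[0][1], $signature['regex']];
        }
    }
    return null;
}
```

In this sketch the case-insensitivity flag only affects literal signatures; a regex signature would carry its own `i` modifier.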
Methods
__construct()
public
__construct([array{chunkSize?: int, maxBytesPerFile?: int|null, includeExtensions?: string[], excludeExtensions?: string[], excludePaths?: string[], signatures?: array, caseInsensitive?: bool} $options = [] ][, LoggerInterface|null $logger = null ]) : mixed
Parameters
- $options : array{chunkSize?: int, maxBytesPerFile?: int|null, includeExtensions?: string[], excludeExtensions?: string[], excludePaths?: string[], signatures?: array, caseInsensitive?: bool} = []
- $logger : LoggerInterface|null = null
clearFindings()
Clear findings.
public
clearFindings() : void
getFindings()
Get accumulated findings (array).
public
getFindings() : array<int, array<string|int, mixed>>
Return values
array<int, array<string|int, mixed>>
isFileSafe()
Scan a single file using fopen/fread and chunked scanning.
public
isFileSafe(string $filePath) : void
Parameters
- $filePath : string
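The chunked technique can be sketched as below for a single literal needle. `findOffsetsChunked` is a hypothetical function written for this illustration; it assumes `$chunkSize` is at least 1 and keeps an overlap of `strlen($needle) - 1` bytes so that a match straddling two chunks is still found exactly once.

```php
<?php
/**
 * Hypothetical sketch: byte offsets of every occurrence of $needle in a file,
 * reading $chunkSize bytes at a time instead of loading the whole file.
 */
function findOffsetsChunked(string $filePath, string $needle, int $chunkSize = 8388608): array
{
    if ($needle === '') {
        return [];
    }
    $handle = fopen($filePath, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $filePath");
    }

    $overlap = strlen($needle) - 1; // shorter than the needle, so no duplicate hits
    $offsets = [];
    $tail = '';       // trailing bytes carried over from the previous buffer
    $chunkStart = 0;  // file offset at which the current chunk begins

    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunkSize);
            if ($chunk === false || $chunk === '') {
                break;
            }
            $buffer = $tail . $chunk;
            $bufferStart = $chunkStart - strlen($tail); // file offset of $buffer[0]

            $pos = 0;
            while (($pos = strpos($buffer, $needle, $pos)) !== false) {
                $offsets[] = $bufferStart + $pos;
                $pos++;
            }

            $chunkStart += strlen($chunk);
            $tail = $overlap > 0 ? substr($buffer, -$overlap) : '';
        }
    } finally {
        fclose($handle);
    }
    return $offsets;
}
```

Peak memory in this sketch is roughly one chunk plus the overlap, independent of file size.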
isSafe()
Scan given path. If path is a directory, scan recursively.
public
isSafe(string $path) : void
Parameters
- $path : string
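Recursive traversal of this kind is commonly written with the SPL directory iterators. The sketch below uses a hypothetical `iterateFiles` generator; whether the class itself uses these iterators is an assumption.

```php
<?php
/**
 * Hypothetical helper: yield every regular file under $path,
 * or $path itself when it is a single file.
 */
function iterateFiles(string $path): Generator
{
    if (is_file($path)) {
        yield $path;
        return;
    }
    if (!is_dir($path)) {
        return; // neither a file nor a directory: nothing to scan
    }

    // SKIP_DOTS omits the "." and ".." entries; the default iteration mode
    // of RecursiveIteratorIterator visits leaves (files) only.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($path, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $fileInfo) {
        if ($fileInfo->isFile()) {
            yield $fileInfo->getPathname();
        }
    }
}
```

Each yielded path could then be filtered by the extension and path rules before being handed to a per-file scan.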
setFindingCallback()
Set callback to be invoked for each finding.
public
setFindingCallback(callable $cb) : void
Callback signature: function(array $finding): void
Finding array keys:
- file (string)
- signature (string)
- offset (int): byte offset in file
- line (int): 1-based line number (approximate)
- snippet (string): surrounding content (up to ~160 chars)
- truncated (bool): whether scanning of the file was truncated because of maxBytesPerFile
Parameters
- $cb : callable
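A callback matching the documented signature might look like the sketch below. The sample finding is invented purely to exercise the closure, and the commented-out registration line assumes an existing `$scanner` instance.

```php
<?php
// Collect a one-line summary per finding instead of printing immediately.
$collected = [];
$onFinding = function (array $finding) use (&$collected): void {
    $collected[] = sprintf(
        '%s:%d matched "%s" at byte %d%s',
        $finding['file'],
        $finding['line'],
        $finding['signature'],
        $finding['offset'],
        $finding['truncated'] ? ' (scan truncated)' : ''
    );
};

// With a real scanner instance this would be registered as:
// $scanner->setFindingCallback($onFinding);

// Illustrative finding with invented values, using the documented keys.
$sample = [
    'file'      => '/tmp/upload.php',
    'signature' => 'eval(',
    'offset'    => 42,
    'line'      => 3,
    'snippet'   => 'eval($_POST["x"]);',
    'truncated' => false,
];
$onFinding($sample);
```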
computeMaxSignatureLength()
Compute maximum signature length (used to determine overlap between chunks).
private
computeMaxSignatureLength() : void
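One way the overlap length could be derived is sketched below. The name `maxSignatureLength` and the `$regexFallback` parameter are hypothetical: a regex has no fixed match length, so this example simply assumes a configurable upper bound for regex signatures.

```php
<?php
/**
 * Hypothetical helper: length in bytes of the longest signature.
 * Assumption: regex signatures count as $regexFallback bytes, since the
 * length of what they can match is not known in advance.
 */
function maxSignatureLength(array $signatures, int $regexFallback = 256): int
{
    $max = 0;
    foreach ($signatures as $signature) {
        $length = is_string($signature) ? strlen($signature) : $regexFallback;
        $max = max($max, $length);
    }
    return $max;
}
```

Carrying that many bytes (or one fewer) from the end of each buffer into the next guarantees that a literal signature split across a chunk boundary is seen whole.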
defaultSignatures()
Default signatures to look for.
private
defaultSignatures() : array<int, string>
Return values
array<int, string>
isPathExcluded()
Decide whether a path should be excluded by excludePaths.
private
isPathExcluded(string $path) : bool
Parameters
- $path : string
Return values
bool
normalizeExtensions()
Normalize extensions to lowercase, no dot.
private
normalizeExtensions(array<int, string> $exts) : array<string|int, string>
Parameters
- $exts : array<int, string>
Return values
array<string|int, string>
recordFindingFromBuffer()
Record a finding based on a match inside the concatenated buffer.
private
recordFindingFromBuffer(string $filePath, string $matchedSignature, int $posInBuffer, string $buffer, int $filePosStart, int $prevTailLen, int $lineOffset, bool $truncated) : void
Parameters
- $filePath : string
- $matchedSignature : string
- $posInBuffer : int
  Position of the match inside $buffer.
- $buffer : string
  Full buffer (prevTail + chunk).
- $filePosStart : int
  Bytes already read before the current chunk (start of chunk).
- $prevTailLen : int
  Length of the prevTail prepended to the chunk.
- $lineOffset : int
  Approximate line offset at the start of the chunk.
- $truncated : bool
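Given the parameter descriptions above, the arithmetic that turns a buffer position into a file offset, line number, and snippet could look like this sketch. `locateMatch` is a hypothetical function, and the formulas are inferred from the parameter descriptions rather than taken from the implementation.

```php
<?php
/**
 * Hypothetical helper: map a match position in (prevTail + chunk) to
 * a file offset, an approximate line number, and a short snippet.
 */
function locateMatch(
    string $buffer,
    int $posInBuffer,
    int $filePosStart,
    int $prevTailLen,
    int $lineOffset,
    int $snippetRadius = 80
): array {
    // $buffer[0] sits $prevTailLen bytes before the start of the chunk.
    $offset = $filePosStart - $prevTailLen + $posInBuffer;

    if ($posInBuffer >= $prevTailLen) {
        // Match starts in the chunk: add newlines between chunk start and match.
        $between = substr($buffer, $prevTailLen, $posInBuffer - $prevTailLen);
        $line = $lineOffset + substr_count($between, "\n");
    } else {
        // Match starts in the tail: subtract newlines between match and chunk start.
        $between = substr($buffer, $posInBuffer, $prevTailLen - $posInBuffer);
        $line = $lineOffset - substr_count($between, "\n");
    }

    // Up to 2 * $snippetRadius bytes around the match (about 160 by default).
    $start = max(0, $posInBuffer - $snippetRadius);
    $snippet = substr($buffer, $start, 2 * $snippetRadius);

    return ['offset' => $offset, 'line' => $line, 'snippet' => $snippet];
}
```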
searchBufferForSignatures()
Search given buffer (which is prevTail + chunk) for configured signatures.
private
searchBufferForSignatures(string $buffer, string $filePath, int $filePosStart, int $prevTailLen, int $lineOffset, bool $truncated) : void
Parameters
- $buffer : string
  The concatenated buffer.
- $filePath : string
- $filePosStart : int
  Bytes already consumed before the current chunk (position of chunk start).
- $prevTailLen : int
  Length of the prevTail prepended to the buffer.
- $lineOffset : int
  Approximate 1-based line number at the start of the chunk. The function may update this value by reference.
- $truncated : bool
  Whether scanning of the file will be truncated (passed to the callback).
shouldScanByExtension()
Decide whether to scan file by extension rules.
private
shouldScanByExtension(string $file) : bool
Parameters
- $file : string
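The extension rules described for includeExtensions and excludeExtensions, together with the normalization to lowercase without a dot, can be sketched as follows. Both function names are hypothetical and the lists are passed explicitly instead of being read from the object.

```php
<?php
/** Hypothetical helper: lowercase, strip leading dots, drop empties and duplicates. */
function normalizeExts(array $exts): array
{
    $out = [];
    foreach ($exts as $ext) {
        $ext = strtolower(ltrim(trim($ext), '.'));
        if ($ext !== '') {
            $out[] = $ext;
        }
    }
    return array_values(array_unique($out));
}

/**
 * Hypothetical helper: a non-empty whitelist decides alone;
 * otherwise the blacklist is consulted.
 */
function extensionAllowed(string $file, array $include, array $exclude): bool
{
    $ext = strtolower(pathinfo($file, PATHINFO_EXTENSION));
    if ($include !== []) {
        return in_array($ext, $include, true);
    }
    return !in_array($ext, $exclude, true);
}
```

In this sketch a file without any extension is scanned unless a whitelist is set, because its empty extension is never on the blacklist.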