Enterprise Recon 2.10.0

Distributed Scan

You can use ER2 to perform a distributed scan on a Target or Target location using a group of Proxy Agents. Distributed scans allow you to:

  1. Reduce scanning time by running multiple scanning processes in parallel.
  2. Optimize resources by distributing the scanning load across multiple Proxy Agent hosts that might otherwise sit idle.

Distributed scans are particularly useful for scanning Targets that have a vast number of locations, for example:

  • An Exchange Server with thousands of mailboxes.
  • A Microsoft SQL Server with hundreds of databases, with thousands of tables per database.

For more information, see Distributed Scan Requirements below.

How a Distributed Scan Works

When a distributed scan starts, the Master Server begins by collecting information about the Target(s) and the Proxy Agents in the Agent Group assigned to the scan. The Master Server uses this information to break down the Target(s) into smaller components or sub-scans, then proceeds to distribute the scan workload among the Proxy Agents that are online and available.

Each Proxy Agent then executes its assigned sub-scans on the Target(s). Results for the Target(s) are progressively processed and displayed in the Web Console as each sub-scan completes. If a Proxy Agent becomes idle (after completing all its assigned tasks) or newly connects while the distributed scan is in progress, outstanding tasks from other Proxy Agents are dynamically reallocated to it to further reduce the overall scan time.

A distributed scan schedule is marked as "Complete" only when all sub-scans distributed among all Proxy Agents have been completed.
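
The distribution and reallocation behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not ER2's actual scheduler: the function name run_distributed_scan, the round-robin initial assignment, the per-sub-scan cost in "ticks", and the rule of taking work from the agent with the most outstanding tasks are all hypothetical. Agents that connect after the scan has started are not modeled.

```python
from collections import deque


def run_distributed_scan(sub_scans, agents):
    """Simulate a distributed scan.

    sub_scans: list of (name, cost) pairs, cost being a positive number of ticks.
    agents:    non-empty list of agent names that are online when the scan starts.
    Returns ({agent: [completed sub-scan names]}, total ticks elapsed).
    """
    # Hypothetical initial distribution: deal sub-scans round-robin.
    queues = {agent: deque() for agent in agents}
    for i, (name, cost) in enumerate(sub_scans):
        queues[agents[i % len(agents)]].append([name, cost])

    completed = {agent: [] for agent in agents}
    ticks = 0
    # The scan counts as complete only when no agent has outstanding sub-scans.
    while any(queues.values()):
        ticks += 1
        for agent in agents:
            if not queues[agent]:
                # Idle agent: take a not-yet-started task from the busiest agent,
                # leaving that agent the task it is currently working on.
                donor = max(agents, key=lambda a: len(queues[a]))
                if len(queues[donor]) > 1:
                    queues[agent].append(queues[donor].pop())
            if queues[agent]:
                task = queues[agent][0]
                task[1] -= 1  # one tick of work on the current sub-scan
                if task[1] == 0:
                    completed[agent].append(queues[agent].popleft()[0])
    return completed, ticks


# One slow sub-scan and three quick ones, shared between two agents.
sub_scans = [("table_1", 5), ("table_2", 1), ("table_3", 1), ("table_4", 1)]
completed, ticks = run_distributed_scan(sub_scans, ["agent-a", "agent-b"])
print(completed, ticks)
```

In this run, agent-b finishes its own two sub-scans and then picks up table_3 from agent-a, so the simulated scan ends after 5 ticks instead of the 6 it would take if agent-a had to work through its original queue alone.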

Distributed Scan Requirements

Proxy Agent Requirements

To perform a distributed scan on a Target or group of Targets, you must first create an Agent Group (see Create an Agent Group) and assign it to the Target or Target location. Ensure that all Proxy Agents in the Agent Group:

  • Have been upgraded to version 2.1 or above.
  • Support scanning of the Target platform.
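
If you keep your own inventory of Proxy Agents, the two requirements above can be checked before a group is assigned. The sketch below is illustrative only: the ProxyAgent record, its fields, and the platform names are hypothetical and do not correspond to an ER2 API. It compares versions numerically, so that 2.10.0 is correctly treated as newer than 2.1.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyAgent:
    # Hypothetical inventory record; not an ER2 data structure.
    name: str
    version: str                                  # e.g. "2.10.0"
    platforms: set = field(default_factory=set)   # Target platforms it can scan


MIN_VERSION = (2, 1)  # minimum Agent version stated in the requirements


def parse_version(version):
    """Turn "2.10.0" into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def ineligible_agents(agents, target_platform):
    """Return {agent name: reason} for every agent that fails a requirement."""
    problems = {}
    for agent in agents:
        if parse_version(agent.version) < MIN_VERSION:
            problems[agent.name] = "version below 2.1"
        elif target_platform not in agent.platforms:
            problems[agent.name] = f"does not support {target_platform}"
    return problems


group = [
    ProxyAgent("agent-a", "2.10.0", {"MySQL", "PostgreSQL"}),
    ProxyAgent("agent-b", "2.0.9", {"MySQL"}),
    ProxyAgent("agent-c", "2.1", {"PostgreSQL"}),
]
print(ineligible_agents(group, "MySQL"))
```

An empty result means every agent in the group meets both requirements for the given Target platform.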

Supported Targets

You can run a distributed scan on the following supported Target types:

Target Type: Description

Windows Share: Scans are distributed across the folders and files under the Path of the network storage location as specified in the scan schedule. If the number of files under the Path exceeds a certain limit,
  • distributed scanning will be disabled for the scan schedule,
  • the change will be captured in the Activity Log, and
  • the network storage Path will then be assigned to a single Proxy Agent from the Agent Group.

Remote Access via SSH: Scans are distributed across the folders and files under the Path of the network storage location as specified in the scan schedule. If the number of files under the Path exceeds a certain limit,
  • distributed scanning will be disabled for the scan schedule,
  • the change will be captured in the Activity Log, and
  • the network storage Path will then be assigned to a single Proxy Agent from the Agent Group.

IBM DB2: Scans are distributed across the tables in the database.

InterSystems Caché: Scans are distributed across the tables in the database.

MongoDB: Scans are distributed across the collections in the MongoDB Server.

MariaDB: Scans are distributed across the tables in the database.

Microsoft SQL Server: Scans are distributed across the tables in the database.

MySQL: Scans are distributed across the tables in the database.

Oracle Database: Scans are distributed across the tables in the database.

PostgreSQL: Scans are distributed across the tables in the database.

SAP HANA: Scans are distributed across the tables in the database.

Sybase / SAP ASE: Scans are distributed across the tables in the database.

SharePoint Server: Scans are distributed across the sites in the SharePoint Server.

Confluence On-Premises: Scans are distributed across the spaces, blog post folder, and/or top-level pages that are one-level below the selected location(s).

  Example 1

  When the entire Confluence domain is selected, the scans will be distributed across each space (e.g. Space Engineering and Space Product) in the domain.

  Confluence [host name: my-confluence-server]
    Confluence on target MY-CONFLUENCE-SERVER
      Space Engineering
        Blog Post Folder
          Blog Post January
      Space Product
        Page Feature
          Page Feature A
          Page Feature B

  Example 2

  The scans for Space Engineering will be distributed across the blog post folder (Blog Post Folder) and top-level page (Page Development).

  Confluence [host name: my-confluence-server]
    Confluence on target MY-CONFLUENCE-SERVER
      Space Engineering
        Blog Post Folder
          Blog Post January
          Blog Post February
        Page Development
          Page Bug Fixes
          Page Enhancements
      Space Product
        Page Feature
          Page Feature A
          Page Feature B
        Page Release
          Page Release Q1
          Page Release Q2

Amazon S3 Buckets: Scans are distributed across the Amazon S3 Buckets in the Amazon account.

Azure Storage: Scans are distributed across the Blobs, Tables or Queues in the Azure Storage account.

Box Inc: Scans are distributed across the locations in the Box Inc domain that are selected for the scan schedule. For example, in the scenario below, the scans will be distributed across four locations.

  Box [domain: example.app.box.com]
    Group Administration
    Group Engineering
      User user1@example.com
      User user2@example.com
    Group Finance
      User user3@example.com
      User user4@example.com
      User user5@example.com
    Group Human Resource
    Group Sales

Exchange Domain: Scans are distributed across the mailboxes in the Exchange domain.

Exchange Online: Scans are distributed across the mailboxes in the Microsoft 365 domain.

Google Workspace: Scans are distributed across the users in the Google Workspace domain.

Google Cloud Storage: Scans are distributed across the buckets in the Google Cloud Storage project.

Microsoft OneNote: Scans are distributed across the user or group name notebooks in the Microsoft 365 domain.

Microsoft Teams: Scans are distributed across the (i) channels in a team, or (ii) users in a group within the Microsoft 365 domain.

Rackspace Cloud: Scans are distributed across the cloud server regions in the Rackspace account.

SharePoint Online: Scans are distributed across the sites in the SharePoint Online domain.
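
For Confluence On-Premises, the "one level below the selected location" rule can be made concrete with a short sketch. It is an illustration only and not how ER2 enumerates Confluence content: the nested-dictionary layout and the function name distribution_units are hypothetical, and the nesting of pages beneath each space is assumed from the names used in Example 2.

```python
# Content tree from Example 2, written as nested dictionaries (assumed nesting).
CONFLUENCE = {
    "Space Engineering": {
        "Blog Post Folder": {"Blog Post January": {}, "Blog Post February": {}},
        "Page Development": {"Page Bug Fixes": {}, "Page Enhancements": {}},
    },
    "Space Product": {
        "Page Feature": {"Page Feature A": {}, "Page Feature B": {}},
        "Page Release": {"Page Release Q1": {}, "Page Release Q2": {}},
    },
}


def distribution_units(tree, selected=()):
    """Return the names one level below the selected location.

    selected is the path from the domain root to the selected location;
    an empty path means the entire Confluence domain is selected.
    """
    node = tree
    for name in selected:
        node = node[name]  # walk down to the selected location
    return list(node)      # its direct children are the units to distribute


print(distribution_units(CONFLUENCE))
print(distribution_units(CONFLUENCE, ["Space Engineering"]))
```

Selecting the whole domain yields the two spaces, as in Example 1, and selecting Space Engineering yields Blog Post Folder and Page Development, as in Example 2.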

Start a Distributed Scan

Starting a distributed scan is the same as starting any other scan, except that you select an Agent Group as the proxy host.

  1. Log in to the ER2 Web Console.
  2. Navigate to the Select Locations page by clicking:
    • Scans > New Scan, or
    • the New Scan button on the Dashboard, Targets, or Scans > Schedule Manager page.
  3. On the Select Locations page, click + Add Unlisted Target. Follow the on-screen instructions to add a new Target.
  4. When prompted to select an Agent to act as proxy host, click on the Select proxy agent menu and select a suitable Agent Group.
  5. Click Test, and then Commit.
  6. On the Select Data Types page, select the Data Type Profiles to be included in your scan and click Next. See Data Type Profiles.
  7. Set a scan schedule in the Set Schedule section. Click Next.
  8. Review your scan configuration. Once done, click Start Scan.

Monitor a Distributed Scan Schedule

Distributed scans appear on the Targets page and the Scans > Schedule Manager page in the Web Console just like any other scan. See View and Manage Scans for more information.