Task

A task is a process triggered by a spider which crawls data from websites, performs specific operations, or serves other functionalities. It is the basic unit of the execution process of spiders.

In Crawlab, you can run a task with a single click and then view its details, such as stats, realtime logs, and crawled data. You can also set the Priority of tasks to determine their execution sequence.
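
To illustrate how a priority setting can determine execution sequence, here is a minimal sketch of a priority queue. It is a conceptual model only, not Crawlab's scheduler: the function names and spider names are hypothetical, and it assumes that a smaller priority number runs earlier and that tasks with equal priority run in submission order.

```python
import heapq
from itertools import count

# Conceptual model only, not Crawlab's implementation.
# Assumption: a smaller priority number means the task runs earlier.
_submission_order = count()  # tie-breaker: equal priorities keep submission order


def enqueue(queue, name, priority):
    """Add a task (identified by a hypothetical name) to the queue."""
    heapq.heappush(queue, (priority, next(_submission_order), name))


def drain(queue):
    """Pop every task and return the names in execution order."""
    names = []
    while queue:
        _, _, name = heapq.heappop(queue)
        names.append(name)
    return names


queue = []
enqueue(queue, "news-spider", 5)
enqueue(queue, "price-spider", 1)
enqueue(queue, "blog-spider", 5)
execution_order = drain(queue)
print(execution_order)
```

In this model, "price-spider" runs first because it has the smallest priority number, and the two tasks with priority 5 follow in the order they were submitted.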

Run Task

You can either run a task from a spider, or follow the steps below.

  1. Navigate to the Tasks page.
  2. Click the New Tasks button on the top left.
  3. Select a Spider and choose other settings.
  4. Click Confirm.

Restart Task

  1. Navigate to the Tasks page.
  2. Click the Restart button on the right.

Monitor Task

Crawlab provides task monitoring so that you can closely watch the results and performance of your crawling tasks.

View Logs

You can view realtime logs in Crawlab.

  1. Navigate to the task detail page.
  2. Click the Logs tab.

View Data

You can view crawled data in realtime.

  1. Navigate to the task detail page.
  2. Click the Data tab.

Cancel Task

While a task is Pending or Running, you can cancel it in either of two ways:

  1. Click the Cancel button on the right in the Tasks page, or
  2. Click the Cancel button on the nav bar in the task detail page.
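
The rule above, that only Pending or Running tasks can be cancelled, can be sketched as a simple status check. This is an illustrative model, not Crawlab's code: the function name, the lowercase status strings, and the "finished" status used in the example are assumptions.

```python
# Illustrative model only. The status strings are assumed to be
# lowercase forms of the Pending and Running states named in the text.
CANCELLABLE_STATUSES = {"pending", "running"}


def can_cancel(status: str) -> bool:
    """Return True if a task in this status may still be cancelled."""
    return status.strip().lower() in CANCELLABLE_STATUSES


print(can_cancel("Pending"))   # a queued task can be cancelled
print(can_cancel("Running"))   # so can a task that is executing
print(can_cancel("finished"))  # hypothetical terminal status: nothing to cancel
```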