Weird actor
rezaczu/super-actor
.actor/Dockerfile
# Specify the base Docker image. You can read more about
# the available images at https://docs.apify.com/sdk/js/docs/guides/docker-images
# You can also use any other image from Docker Hub.
FROM apify/actor-node:20

# Check preinstalled packages
RUN npm ls crawlee apify puppeteer playwright

# Copy just package.json and package-lock.json
# to speed up the build using the Docker layer cache.
COPY package*.json ./

# Install NPM packages, skipping optional and development dependencies to
# keep the image small. Avoid logging too much and print the dependency
# tree for debugging.
RUN npm --quiet set progress=false \
 && npm install --omit=dev --omit=optional \
 && echo "Installed NPM packages:" \
 && (npm list --omit=dev --all || true) \
 && echo "Node.js version:" \
 && node --version \
 && echo "NPM version:" \
 && npm --version \
 && rm -r ~/.npm

# Next, copy the remaining files and directories with the source code.
# Since we do this after NPM install, the build will be fast
# for most source file changes.
COPY . ./

# Run the image.
CMD npm start --silent
.actor/actor.json
{
    "actorSpecification": 1,
    "name": "my-actor-22",
    "title": "Scrape single page in JavaScript",
    "description": "Scrape data from a single page with a provided URL.",
    "version": "0.0",
    "meta": {
        "templateId": "js-start"
    },
    "input": "./input_schema.json",
    "dockerfile": "./Dockerfile"
}
.actor/input_schema.json
{
    "title": "Scrape data from a web page",
    "type": "object",
    "schemaVersion": 1,
    "properties": {
        "url": {
            "title": "URL of the page",
            "type": "string",
            "description": "The URL of the website you want to get the data from.",
            "editor": "textfield",
            "prefill": "https://www.apify.com/"
        }
    },
    "required": ["url"]
}
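For reference, a minimal input that satisfies this schema looks like the sketch below. When running locally with the Apify CLI, input is typically read from storage/key_value_stores/default/INPUT.json (the one storage file the .gitignore below keeps under version control); that path reflects the default local storage layout.

{
    "url": "https://www.apify.com/"
}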
.dockerignore
# configurations
.idea

# crawlee and apify storage folders
apify_storage
crawlee_storage
storage

# installed files
node_modules

# git folder
.git
.gitignore
# This file tells Git which files shouldn't be added to source control
.DS_Store
.idea
dist
node_modules
apify_storage
storage/*
!storage/key_value_stores
storage/key_value_stores/*
!storage/key_value_stores/default
storage/key_value_stores/default/*
!storage/key_value_stores/default/INPUT.json
package.json
{
    "name": "js-scrape-single-page",
    "version": "0.0.1",
    "type": "module",
    "description": "This is an example of an Apify actor.",
    "engines": {
        "node": ">=18.0.0"
    },
    "dependencies": {
        "apify": "^3.2.6",
        "axios": "^1.5.0",
        "cheerio": "^1.0.0-rc.12"
    },
    "scripts": {
        "start": "node ./src/main.js",
        "test": "echo \"Error: oops, the actor has no tests yet, sad!\" && exit 1"
    },
    "author": "It's not you it's me",
    "license": "ISC"
}
src/main.js
// Axios - Promise based HTTP client for the browser and Node.js (Read more at https://axios-http.com/docs/intro).
import axios from 'axios';
// Cheerio - The fast, flexible & elegant library for parsing and manipulating HTML and XML (Read more at https://cheerio.js.org/).
import * as cheerio from 'cheerio';
// Apify SDK - toolkit for building Apify Actors (Read more at https://docs.apify.com/sdk/js/).
import { Actor } from 'apify';
// This is an ESM project, and as such, it requires you to specify extensions in your relative imports.
// Read more about this here: https://nodejs.org/docs/latest-v18.x/api/esm.html#mandatory-file-extensions
// import { router } from './routes.js';

// The init() call configures the Actor for its environment. It's recommended to start every Actor with an init().
await Actor.init();

// The structure of the input is defined in .actor/input_schema.json.
const input = await Actor.getInput();
const { url } = input;

// Fetch the HTML content of the page.
const response = await axios.get(url);

// Parse the downloaded HTML with Cheerio to enable data extraction.
const $ = cheerio.load(response.data);

// Extract all headings from the page (tag name and text).
const headings = [];
$('h1, h2, h3, h4, h5, h6').each((i, element) => {
    const headingObject = {
        level: $(element).prop('tagName').toLowerCase(),
        text: $(element).text(),
    };
    console.log('Extracted heading', headingObject);
    headings.push(headingObject);
});

// Save the headings to the Dataset - a table-like storage.
await Actor.pushData(headings);

// Gracefully exit the Actor process. It's recommended to quit all Actors with an exit().
await Actor.exit();
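For a sense of the result, each dataset record has the { level, text } shape built above. A page whose only headings were <h1>Example Domain</h1> and <h2>More information</h2> would yield a dataset like this (illustrative values, not output from a real run):

[
    { "level": "h1", "text": "Example Domain" },
    { "level": "h2", "text": "More information" }
]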