0 votes
1 answer
19 views

web-scraping using R selenider on linux error --user-data-dir

I'm attempting to web-scrape using the following R code (which was obtained from this thread: link to other question): library(selenider) library(rvest) session <- selenider_session("selenium&...
Nick Amato
0 votes
1 answer
46 views

'list' object cannot be coerced to type 'double' error

I've written some code to scrape a webpage; however, when I try to make some modifications, I get an error. Below is my code: library(httr) library(jsonlite) library(dplyr) library(janitor) ...
HowGoodisData
0 votes
1 answer
101 views

Way to web-scrape a popular eSport website using R?

I'm attempting to webscrape the following url to obtain live game data: https://egamersworld.com/callofduty/matches I've attempted to inspect the fetch requests being made, but there isn't an obvious ...
Nick Amato
-4 votes
0 answers
13 views

I need help using the Python request package to register on a website from my own GUI [closed]

How can I register on a website using the Python request package if it has CAPTCHA validation? I am sending a payload to the website's server with appropriate headers and all necessary details, but ...
Ishan Kishan
0 votes
0 answers
6 views

AMQP Node.js disconnect event

I'm using amqp in my Node.js scraping service. Sometimes I get random disconnects, and when it auto-reconnects, it starts a new whenConnected promise while the previous one is still running, so my app gets messed up: ...
Rubén M
  • 119
-2 votes
0 answers
28 views

Write and fill Excel file with scraped data using Puppeteer and Node.js [closed]

I have a bunch of data stored in an array of objects and now want to fill it into an already existing Excel file. Each property has its column where it needs to be filled into and it should start ...
Honeybadger
-4 votes
0 answers
67 views

Webscraping Page issue (code no longer works) [closed]

I had set up some code to scrape the following page https://www.nba.com/stats/players/catch-shoot Below is my code which used to run perfectly fine, but when I tried running it just now I got the ...
gimmethedata123
0 votes
0 answers
16 views

Collecting metadata of the reels/posts sent to yourself on Instagram

I need to collect the metadata of the reels/posts that I have sent to myself. The problem I am running into is that Meta's Graph API does not allow access to private DMs. Is there any way around it without ...
Divyam Sharma
-1 votes
0 answers
20 views

My telegram bot does not respond to commands [closed]

My telegram bot does not respond to commands. As soon as I start the telegram bot it responds to the commands I entered before turning it on and then it just stops seeing the commands I send. Moreover,...
user25074879
0 votes
2 answers
74 views

How to Scrape a JavaScript-Rendered Table? (wait_for_selector Timeout & Data Not Loading) [closed]

I'm trying to scrape a table from a webpage, but the table is dynamically loaded via JavaScript and appears 5-7 seconds after page load when viewed manually. However, when using a web scraper, the ...
HamidBee
  • 289
-2 votes
0 answers
41 views

Can't scrape email from GitHub sidebar (vcard) with BeautifulSoup

I'm trying to scrape emails from GitHub profiles. I can get emails from the main section, but I'm unable to scrape the email from the sidebar (vcard) using BeautifulSoup. I can get emails from the ...
Balla P. Tall
0 votes
0 answers
44 views

How to Extract Code Blocks from Different Tabs in a Code Documentation Using Crawl4AI (or any other tool)?

I'm trying to scrape code blocks from multiple tabs in a documentation page using Crawl4AI. While I'm able to extract Markdown content, the code blocks inside tabbed sections are not being captured. ...
harsha bajaj
0 votes
1 answer
98 views

Failed to identify the reason why my script is missing a few results while scraping a webpage

I've created a script in Python to scrape consultant links from this webpage based on the country filter United States, located in the left sidebar. The webpage shows 2,025 results. However, when I ...
MITHU
  • 164
-1 votes
1 answer
46 views

Coordinates of a location in a web page [closed]

I am trying to extract the coordinates (i.e., longitude and latitude) of the pointer on a map depicted on a static Google Maps image from a house listing. Example: https://www.zoopla.co.uk/to-rent/...
Chioma Okoroafor
0 votes
2 answers
67 views

How to search with an XPath selector in "nodriver" in Python

I am not sure about the correct way to search for specific items using XPath in nodriver in Python. I'm using this to try to select a button with "confirm" text inside: await tab.select(&...
jguerr
  • 3
-1 votes
0 answers
16 views

How can we use coroutines in web scraping?

When scraping 100 URLs, the work can be broadly divided into three stages: (1) accessing the URL, (2) waiting for the page to load, and (3) parsing and retrieving the data on the page ...
JAEIK JEONG
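
The three stages listed in the question above are mostly waiting (on the network, then on the page), which is the situation coroutines are designed for. The following is a minimal sketch in Python's `asyncio`, not a real crawler: the stages are simulated with `asyncio.sleep`, and the names `scrape_one`, `scrape_all`, the `limit` parameter, and the `example.com` URLs are all hypothetical.

```python
import asyncio
import time


async def scrape_one(url: str) -> dict:
    """Run the three stages for a single URL (all stages are simulated)."""
    await asyncio.sleep(0.05)  # stage 1: access the URL (stand-in for a request)
    await asyncio.sleep(0.05)  # stage 2: wait for the page to load
    # stage 3: "parse" the page; here we just take the last path segment
    return {"url": url, "title": url.rsplit("/", 1)[-1]}


async def scrape_all(urls: list[str], limit: int = 20) -> list[dict]:
    """Scrape every URL concurrently, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url: str) -> dict:
        async with semaphore:
            return await scrape_one(url)

    # gather() schedules all coroutines first, then waits for them together;
    # results come back in the same order as `urls`.
    return await asyncio.gather(*(bounded(u) for u in urls))


urls = [f"https://example.com/page/{i}" for i in range(100)]  # hypothetical URLs
start = time.perf_counter()
results = asyncio.run(scrape_all(urls))
elapsed = time.perf_counter() - start
print(f"scraped {len(results)} pages in {elapsed:.2f}s")
```

The speed-up comes from starting all the work before waiting on any of it. Writing `for u in urls: await scrape_one(u)` instead would run the pages one after another, because each `await` blocks the loop body until that page is done.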
1 vote
1 answer
96 views

Webscraping instruction for an R user

I am a statistician/data scientist, R user, runner, and a beginner in the realm of webscraping. I recently completed a race in Tampa, FL and the results are posted online. I would like to use some web ...
Omar123456789
1 vote
3 answers
68 views

How can I webscrape PDFs under a dropdown button in HTML?

I'm new to scraping websites with HTML and need to download all PDFs from this website, but the info is under dropdown buttons. I tried inspecting the HTML of the website, and I think the code of the ...
aimee prieto
0 votes
1 answer
61 views

How can I download PDFs using an AI WebCrawler? (Crawler4AI)

I have been using Crawler4AI to try downloading a series of documents from this website. However, since it requires JavaScript code and I am using Python, I don't know how to solve my error. Code, ...
franjefriten
0 votes
1 answer
69 views

Trying to scrape data from a table from a website

I'm trying to pull some data from a table and store it in a CSV file. I'm using the following (all 64-bit): Firefox version 135.0.1 GeckoDriver 0.36.0 Python version is 3.11.0 I'm trying to scrape ...
Machzy
  • 23
-1 votes
0 answers
45 views

Connection to socket.io with R websocket package not working

I am trying to get some data from this page, namely game names and odds and rounds: https://www.winamax.fr/paris-sportifs/sports/1/7/4 I first tried using a GET request from the httr package, by ...
M.O
  • 471
1 vote
1 answer
51 views

Why does the coroutine not switch and instead run synchronously even though a delay is given?

runBlocking { bookLinks.mapIndexed { ranking, bookLink -> val job = async { scrapeBookData(browser, bookLink, ranking) } val result = job.await() if (result != null) { ...
JAEIK JEONG
1 vote
0 answers
41 views

When using Playwright for web crawling, the speed is the same with coroutines as without them

package com.example.demo.service import com.example.demo.dto.BookDTO import com.microsoft.playwright.* import com.microsoft.playwright.options.WaitUntilState import jakarta.transaction.Transactional ...
JAEIK JEONG
1 vote
0 answers
67 views

How to switch to "puppeteer-real-browser" from default puppeteer? [closed]

I want to replace my scraper's "puppeteer" library with "puppeteer-real-browser". I tried many approaches but got a bunch of errors, and I don't want to ask about all of them here to make the process ...
kokoKOK
  • 11
0 votes
2 answers
56 views

Trying to use Chrome with SeleniumBase and the uc=true option

I am trying to scrape a site that has a Cloudflare bot check. I currently use import undetected_chromedriver as uc and a portable CHROME.EXE; however, this does not seem to get me around the bot check, so ...
RobM
  • 835
1 vote
0 answers
116 views

Instagram user web profile info API not working anymore

I've been using this link https://i.instagram.com/api/v1/users/web_profile_info/?username={username} for a while, along with the APP ID, to make requests, and it's been fine, but suddenly it now says { ...
pkdev
  • 113
1 vote
1 answer
62 views

AWS Lambda webscraping through a docker image

I'm learning AWS Lambda and I'm trying to implement a webscraping program. I created my Lambda function from a container image that I built with Docker. My project folder has three files: ...
weyronn12934
0 votes
0 answers
69 views

Nodriver web scraping program gets stuck at cdp.network.get_response_body?

I'm trying to intercept the response from the web server and extract the body. The program uses the nodriver module to successfully load the page and capture the request event. However, when it attempts to send ...
Fab49er
  • 19
-7 votes
0 answers
70 views

Selenium Python scraping

I have the following code: from selenium import webdriver from selenium.webdriver.common.keys import Keys from selenium.webdriver.common.by import By from selenium.webdriver.support.ui import ...
Radosław Poprawski
0 votes
1 answer
68 views

How can one scrape any table from Wikipedia in Python?

I want to scrape tables from Wikipedia in Python. Wikipedia is a good source to get data from, but the data present is in HTML format which is extremely machine unfriendly and cannot be used directly. ...
Ξένη Γήινος
0 votes
0 answers
45 views

crawl4ai gives Error: 'NoneType' object has no attribute 'new_context'

I am trying to scrape data from www.example.com, but the code below returns an error: import asyncio from crawl4ai import AsyncWebCrawler from crawl4ai.async_configs import BrowserConfig, ...
user9291211
0 votes
2 answers
82 views

Playwright Python can't find HTML tag which shows up in debugger and in a print statement

I am trying to scrape a product detail page, but I am not able to find the tag when the code runs. I print the parent tag out and see the h2 tag I want, and when I enter the debugger I can ...
Cody Childers
-2 votes
1 answer
92 views

How does the website know that it's not my browser?

When I access the URL https://www.getfpv.com/media/sitemap.xml from my browser it works, but when I try to do it with Python, it returns 403 Forbidden. How does the website know that it's Python ...
Fab49er
  • 19
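
One signal a server can use to tell a script from a browser is the `User-Agent` request header, which Python HTTP clients fill in with their own name by default. The sketch below uses only the standard library's `urllib` to show that default and to build a request carrying a browser-like value instead. It sends nothing over the network, the header string is an illustrative assumption, and whether this header is what the site in the question actually checks is also an assumption, since sites may rely on other signals as well.

```python
import urllib.request

# urllib's default opener announces itself as "Python-urllib/<version>".
default_headers = dict(urllib.request.build_opener().addheaders)
print(default_headers["User-agent"])

# Build (but do not send) a request with a browser-like User-Agent.
# The URL is the one from the question; the header value is an
# illustrative assumption, not a value known to satisfy this site.
request = urllib.request.Request(
    "https://www.getfpv.com/media/sitemap.xml",
    headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},
)
# Request stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```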
0 votes
0 answers
27 views

How to fill an input and select an option from the dropdown using Puppeteer?

I am working on a project with JavaScript and chose Puppeteer to perform web scraping on various websites. One of the websites I need to scrape is this one Mapas SII, from which I plan to obtain ...
alvaro soto albornoz
0 votes
0 answers
25 views

FeignException - 504 Gateway Time-out

I have a webscraping service that takes a bit too long (sometimes up to 2 hours). I configured it to handle long-running functions; however, I got this error: 2025-02-18T10:40:00.017Z ...
Aziz Zina
0 votes
1 answer
30 views

Collect Google Play Reviews in Multiple Countries

I am trying to collect Google Play reviews on certain apps in English-speaking countries using google-play-scraper. The problem was that when I changed the 'country' parameter, it returned the same ...
Sơn Phạm
0 votes
1 answer
54 views

webscrape table using rvest

I am attempting to scrape the table on this page using rvest: https://www.nrl.com/ladder/?competition=111&round=27&season=2024 This is what I have tried so far: library(rvest) page <- ...
HowGoodisData
-1 votes
0 answers
31 views

Not able to fix the Chromium and Chrome WebDriver versions on an AWS Lambda function for web scraping

I am using Selenium, ChromeDriver, and Chromium to scrape the Amazon website. It works fine on my local system, but when I use this approach in a Lambda function, I get the versioning ...
Shalini Dixit
0 votes
0 answers
52 views

How can I convert an HTML element or React node into an SVG or image?

I am working on a project where I gather data from a user and retrieve stats from their other publicly accessible profiles. Based on this, I generate profile cards or images to display that data. The ...
Sanju Chilukuri
0 votes
2 answers
91 views

How to Resolve Google News Redirects to Get the Final Article URL Using Axios?

I'm trying to scrape news articles from Google News using Node.js. The issue I am facing is with the links provided by the RSS feed: they are Google RSS links, which ...
Deus
  • 13
0 votes
2 answers
58 views

How to do web scraping using PySpark

I have a question about how to do web scraping and read the response in PySpark. Here's my code: import requests import pyspark from pyspark.sql.functions import * from pyspark.sql import SparkSession r =...
Bahy Mohamed
0 votes
1 answer
207 views

Selenium ChromeDriver does not navigate to a URL when using a custom user data directory

This code uses Selenium to crawl the web, but .get() does not seem to work after updating Chrome and ChromeDriver. The Chrome version is "133.0.6943.99" and the ChromeDriver version ...
Mr. OH
  • 1
0 votes
2 answers
105 views

Use R to scrape MLB.com player fielding data

I'm learning how to use R to scrape tables of baseball stats from different places on the web. For example, I adapted this post to scrape a player's minor league fielding data from the player register ...
Buckaroo Banzai
0 votes
1 answer
83 views

How to scrape a website that has hidden data inside a table?

I am trying to scrape the Screener.in website to extract some information related to stocks. However, while trying to extract the Quarterly Results section, there are some fields that are hidden, and when I click ...
Data-7scientist
2 votes
1 answer
77 views

Puppeteer Scraping: See XHR response data before request completes for real time data

I am using puppeteer to scrape a website for real time data in nodejs. Instead of scraping the page, I am watching the backend requests and capturing the JSON/TEXT responses so I have more structured ...
EdE
  • 21
0 votes
1 answer
37 views

Selenium not triggering 'Save' button after modifying a placeholder field in AngularJS page

I am automating a web page using Selenium + Python, and I need to update the Zip Code field. The expected behavior in the UI is: The "Save" button is hidden initially. When I click on the ...
matias cantella
0 votes
0 answers
53 views

How to scrape searched youtube videos with puppeteer

I am trying to use nodeJs with puppeteer to scrape for YouTube video information from the search results. Unfortunately, for some reason, the scrape doesn't load the elements via the document query ...
keshawn Sharper
2 votes
1 answer
70 views

can't find correct 'select' HTML tag value, and trying to wait for a select option to load, playwright Python

I have an issue where I use a URL such as the T-shirts page, and I am trying to scrape the product links off the pages. I have been trying for some time now, but nothing is working yet. This is my ...
Cody Childers
-1 votes
1 answer
61 views

How to scrape links off Google images result with selenium, python?

I'm trying to work on a project, and I need to get the links off google image results. Here is my code: from selenium.webdriver.common.by import By from selenium.webdriver.common.action_chains import ...
Thomas Haddad
0 votes
1 answer
87 views

Pagination error while accessing data using Google Apps Script

I am trying to access URL data (clickable titles) from this table. The script gets the first page correctly, but I could not find a way to get the data from the second page. Here is the sample script: ...
EagleEye
  • 510
