
Scrapy simulate XHR request - returning 400



By : ramesh
Date : November 21 2020, 07:31 AM
I'm trying to get data from a site using Ajax. The page loads and then JavaScript requests the content. See this page for details: https://www.tele2.no/mobiltelefon.aspx. The key problem is the missing quotes around filters in the request body:
code :
url = 'https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters'
req = scrapy.Request(url,
                     method='POST',
                     body='{"filters": []}',
                     headers={'X-Requested-With': 'XMLHttpRequest',
                              'Content-Type': 'application/json; charset=UTF-8'},
                     callback=self.parser2)
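yield req

# Alternatively, define the body as a dict and convert it to JSON with
# json.dumps() (requires `import json` at the top of the module).
params = {"filters": []}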
req = scrapy.Request(url,
                     method='POST',
                     body=json.dumps(params),
                     headers={'X-Requested-With': 'XMLHttpRequest',
                              'Content-Type': 'application/json; charset=UTF-8'},
                     callback=self.parser2)
yield req

Log output:
2014-12-30 12:30:38-0500 [tele2] DEBUG: Crawled (200) <GET https://www.tele2.no/mobiltelefon.aspx/> (referer: None) 
2014-12-30 12:30:42-0500 [tele2] DEBUG: Crawled (200) <POST https://www.tele2.no/Services/Webshop/FilterService.svc/ApplyPhoneFilters> (referer: https://www.tele2.no/mobiltelefon.aspx/) 
test
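
To see why the quoting matters, here is a small standard-library sketch (not part of the original answer) that compares a body with an unquoted key against one produced by json.dumps(). The unquoted string is only an assumed example of what a failing body might look like; the original failing request is not shown above.

```python
import json

# Assumed example of a malformed body: the key is not quoted, so it is not valid JSON.
bad_body = '{filters: []}'
# Body built from a dict; json.dumps() quotes the key for us.
good_body = json.dumps({"filters": []})


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(good_body)                 # {"filters": []}
print(is_valid_json(bad_body))   # False
print(is_valid_json(good_body))  # True
```

A server that expects JSON will typically reject a body like the first one, which would be consistent with the 400 response in the title.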


Scrapy Not Returning After Yielding a Request



By : Albatross
Date : March 29 2020, 07:55 AM
You need to return (or yield) an item from the aftersubmit callback method:
code :
def aftersubmit(self, response):
    hxs = Selector(response)

    item = AnItem()
    item['Name'] = "jsc"
    return item
How to simulate XHR request in Scrapy for dynamically loading web pages?



By : PeterB
Date : March 29 2020, 07:55 AM
You should take a look at FormRequest, which lets you send data via HTTP POST. As you can see, the next button creates a request to http://www.olx.in/ajax/newdelhi/search/list/ with some form data. Just populate the formdata parameter with the needed values from the current Response object. Since you are trying to build pagination, you should check this page on how to do it properly.
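
As a rough illustration of what populating formdata amounts to, the sketch below builds form payloads for successive pages using only the standard library. The field names page and view are hypothetical placeholders, not the real parameters of the olx.in endpoint. In a spider, a dict like this would be passed as the formdata argument of FormRequest, which URL-encodes it into the POST body.

```python
from urllib.parse import urlencode


def build_formdata(page, extra=None):
    """Build the form fields for one page (field names are hypothetical)."""
    data = {"page": str(page)}  # values are kept as strings, as in a form post
    if extra:
        data.update(extra)
    return data


# Encode the first three pages the way an HTML form POST body is encoded.
bodies = [urlencode(build_formdata(p, {"view": "list"})) for p in range(1, 4)]
for body in bodies:
    print(body)
```

The real field names and values should be copied from the request shown in the browser's network panel when the next button is clicked.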
How to simulate an XHR request using Scrapy when trying to crawl data from an ajax-based website?



By : qawsedrftg
Date : March 29 2020, 07:55 AM
I am new to crawling web pages with Scrapy and unfortunately chose a dynamic one to start with... Try this:
code :
from scrapy import FormRequest, Spider

# googleAppItem is assumed to be an Item subclass with a 'url' field,
# defined in the project's items module.


class googleAppSpider(Spider):
    name = "googleApp"
    allowed_domains = ['play.google.com']
    start_urls = ['https://play.google.com/store/apps/category/GAME/collection/topselling_new_free?authuser=0']

    def parse(self, response):
        # POST ten times, moving the 'start' offset forward by 60 results each time.
        for i in range(0, 10):
            yield FormRequest(
                url="https://play.google.com/store/apps/category/GAME/collection/topselling_new_free?authuser=0",
                method="POST",
                formdata={'start': str(i * 60), 'num': '60', 'numChildren': '0',
                          'ipf': '1', 'xhr': '1',
                          'token': 'm1VdlomIcpZYfkJT5dktVuqLw2k:1455483261011'},
                callback=self.data_parse)

    def data_parse(self, response):
        seen = set()  # links already yielded from this response
        links = response.xpath("//a/@href").re(r'/store/apps/details.*')
        for l in links:
            if l not in seen:
                seen.add(l)
                item = googleAppItem()  # fresh item per link, so earlier items are not overwritten
                item['url'] = l
                yield item
Python Scrapy not executing scrapy.Request callback function for every link



By : Chalino Sanchez
Date : March 29 2020, 07:55 AM
Scrapy: Why does the scrapy.Request class call the parse() method by default?



By : Mene
Date : March 29 2020, 07:55 AM
This is decided inside the Scrapy core; see the request.callback or spider.parse part:
code :
def call_spider(self, result, request, spider):
    result.request = request
    dfd = defer_result(result)
    dfd.addCallbacks(request.callback or spider.parse, request.errback)
    return dfd.addCallback(iterate_spider_output)
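
The fallback itself is ordinary Python: `or` returns its right operand when the left one is falsy. The sketch below shows this with stand-in classes. FakeRequest, FakeSpider and pick_callback are hypothetical names made up for illustration and are not Scrapy's own classes or functions.

```python
class FakeRequest:
    """Stand-in for a request object; not Scrapy's Request class."""

    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback


class FakeSpider:
    """Stand-in for a spider with a default parse() method."""

    def parse(self, response):
        return "parse handled " + response

    def parse_detail(self, response):
        return "parse_detail handled " + response


def pick_callback(request, spider):
    # None is falsy, so `or` falls back to spider.parse when no callback was set.
    return request.callback or spider.parse


spider = FakeSpider()
default = pick_callback(FakeRequest("http://example.com"), spider)
custom = pick_callback(FakeRequest("http://example.com", spider.parse_detail), spider)
print(default("page"))  # parse handled page
print(custom("page"))   # parse_detail handled page
```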