hRequests: A Web Scraping and Automation Library


Developed by daijro, hrequests (human requests) is designed to be a simple yet configurable replacement for Python's requests library. It brings a host of new features while maintaining ease of use.

Key Features:

  • Seamless Transition: Switch effortlessly between HTTP requests and headless browsing.
  • High-Performance: Leverages gevent for concurrency without needing monkey-patching.
  • Advanced Parsing: Comes with a fast, integrated HTML parser.
  • JavaScript Rendering: Fully supports JS rendering, crucial for modern web scraping.
  • HTTP/2 Support: Stays current with the latest web protocols.
  • Browser TLS Fingerprints: Mimics actual browser traffic for improved stealth.
  • Efficient JSON Handling: Up to 10x faster JSON serialization than the standard library.
  • Browser Crawling: Simplified browser automation, complete with human-like interaction simulations.

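Installation takes two commands: pip install -U hrequests[all] installs the library with all optional features, and python -m playwright install firefox chromium downloads the browsers used for headless browsing.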

Usage and Documentation

Comprehensive documentation is available on Gitbook and covers everything from simple GET requests to complex browser automation tasks. The module supports the standard HTTP methods (GET, POST, PUT, DELETE, HEAD, OPTIONS, and PATCH), and its responses closely resemble those of the original requests library.
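As a rough illustration of how closely responses follow the requests style, the sketch below fetches a page and collects a few response fields. The helper names summarize and fetch_summary are hypothetical, as is the get parameter, which exists only so a stub can be passed in without network access. The attributes status_code, ok, and text are assumed to behave as they do in requests.

```python
def summarize(resp):
    """Collect a few requests-style fields from a response object."""
    return {
        "status": resp.status_code,  # numeric HTTP status code
        "ok": resp.ok,               # assumed truthy when the request succeeded
        "length": len(resp.text),    # size of the decoded body
    }


def fetch_summary(url, get=None):
    """GET `url` and summarize the response.

    `get` is a hypothetical hook for tests; hrequests.get is used by default.
    """
    if get is None:
        import hrequests  # imported lazily so the helper can be stubbed
        get = hrequests.get
    return summarize(get(url))
```

Calling fetch_summary("https://example.com") would issue a real GET request through hrequests and return the three fields as a dictionary.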

Concurrent & Lazy Requests

hrequests introduces nohup requests, which return immediately and complete in the background, and list-based concurrency for sending multiple requests simultaneously.
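The two ideas can be sketched as follows. This is a minimal example that assumes the nohup keyword on hrequests.get and the async_get and map helpers work as described in the project's documentation; the wrapper functions and the client parameter are hypothetical and only make the sketch easy to exercise with a stub.

```python
def fetch_in_background(url, client=None):
    """Start a GET without blocking and return the lazy response."""
    if client is None:
        import hrequests as client  # lazy import; a stub can be passed instead
    # With nohup=True the call returns immediately; reading an attribute
    # such as .status_code later waits for the request to finish.
    return client.get(url, nohup=True)


def fetch_all(urls, size=3, client=None):
    """Send every URL concurrently, at most `size` requests at a time."""
    if client is None:
        import hrequests as client
    # async_get builds unsent requests; map then sends the whole list.
    pending = [client.async_get(u) for u in urls]
    return client.map(pending, size=size)
```

A nohup request suits fire-and-check-later work, while the list form is the better fit when all results are needed before moving on.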

HTML Parsing and Browser Automation

hrequests builds its HTML parser on selectolax for fast scraping, and it allows detailed parsing of and interaction with web content. Browser automation works with both headless and windowed browsers and supports browser extensions in Chrome and Firefox.
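A small parsing sketch, assuming the parsed document exposed on a response (resp.html) offers a find_all method that takes a CSS selector and that each element has text and dict-like attrs. The function name extract_links is hypothetical.

```python
def extract_links(html):
    """Return (text, href) pairs for every anchor in a parsed document.

    `html` is expected to be the object exposed as `resp.html`.
    """
    links = []
    for anchor in html.find_all("a"):    # CSS selector matching all <a> tags
        href = anchor.attrs.get("href")  # attrs is assumed to be dict-like
        if href:                         # skip anchors without a target
            links.append((anchor.text.strip(), href))
    return links
```

With a live response this would be called as extract_links(resp.html).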

Exception Handling

The library provides ways to handle exceptions and timeouts, which helps keep long-running scripts reliable.
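One common way to apply this in a scraping script is a small retry wrapper. The sketch below is generic: the text does not name the library's exception classes, so it catches Exception broadly, and the helper get_with_retries and its parameters are hypothetical. In real code the except clause should be narrowed to the exception types listed in the hrequests documentation.

```python
import time


def get_with_retries(url, fetch, attempts=3, delay=0.0):
    """Call fetch(url), retrying on failure up to `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except Exception as exc:  # broad on purpose; narrow this in real code
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # pause before the next attempt
    # Every attempt failed: surface the last error as the cause.
    raise RuntimeError(f"{url} failed after {attempts} attempts") from last_error
```

With hrequests installed, fetch could be a function such as lambda u: hrequests.get(u), optionally passing a timeout if the installed version accepts one.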

Community and Contributions

Maintained with help from the Python community, hrequests welcomes contributions and feedback, which drive its continuous improvement.

Licensing and Support

hrequests is licensed under the Apache Software License (Apache-2.0), a permissive license that suits a wide range of projects. Python versions from 3.7 to 3.11 are supported.

In conclusion, hrequests is a powerful tool for Python developers, offering enhanced features and capabilities for web requests and scraping. Its user-friendly design, coupled with advanced functionalities, makes it an excellent choice for both beginners and seasoned professionals in Python web development.