Python library for rendering HTML and JavaScript

Is there any Python module for rendering an HTML page with JavaScript and getting back a DOM object?

I want to parse a page which generates almost all of its content using JavaScript.

-------------Problems Reply------------

The big complication here is emulating the full browser environment outside of a browser. You can use standalone JavaScript interpreters like Rhino and SpiderMonkey to run JavaScript code, but they don't provide the complete browser-like environment needed to fully render a web page.

If I needed to solve a problem like this, I would first look at how the JavaScript renders the page. It is quite possible that it fetches data via AJAX and uses that to build the page. I could then use Python libraries like simplejson and httplib2 to fetch the data directly and use that, removing the need to access the DOM object. However, that's only one possible situation; I don't know the exact problem you are solving.
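A minimal sketch of that direct-fetch idea, assuming the page's script requests a JSON document from some endpoint. It uses the standard library's `json` and `urllib.request` in place of simplejson and httplib2. The function names, the extra request header, and the payload shape (`items` entries with a `title` key) are all hypothetical; inspect the real requests in your browser's network tools to see what the page actually loads.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, timeout=10):
    """Request the same URL the page's script would call and decode the JSON body."""
    # Some endpoints only answer requests that look like AJAX calls; this
    # header is an assumption and may not be needed for your site.
    request = Request(url, headers={
        "Accept": "application/json",
        "X-Requested-With": "XMLHttpRequest",
    })
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


def extract_titles(payload):
    """Pull the interesting fields out of the decoded data.

    The payload shape used here is hypothetical.
    """
    return [item["title"] for item in payload.get("items", [])]
```

If this works for your page, you never have to run the page's JavaScript at all: call `fetch_json` with the endpoint URL and pass the result to something like `extract_titles`.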

Other options include the Selenium one mentioned by Łukasz, some kind of embedded WebKit craziness, some kind of IE win32 scripting craziness or, finally, a PyXPCOM-based solution (with added craziness). All of these have the drawback of requiring pretty much a fully running web browser for Python to play with, which might not be an option depending on your environment.

QtWebKit is included in PyQt4, but I don't know whether you can use it without showing a widget. After a cursory look over the documentation, it seems to me you can only get the HTML, not a DOM tree.

The only way I know to accomplish this would be to drive a real browser, for example using selenium-rc.
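As a rough sketch of driving a real browser from Python: the code below uses the Selenium WebDriver bindings (the `selenium` package) rather than selenium-rc itself, and assumes Firefox and its driver are installed. The helper names `rendered_html` and `render_with_firefox` are made up for this example.

```python
def rendered_html(driver, url):
    """Load the URL in the browser and return the DOM serialized after scripts ran."""
    driver.get(url)
    # Ask the browser itself for the current state of the document.
    return driver.execute_script("return document.documentElement.outerHTML")


def render_with_firefox(url):
    """Start Firefox, render the page, and always close the browser afterwards."""
    # Imported here so that rendered_html can be used with any driver-like object.
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Firefox()
    try:
        return rendered_html(driver, url)
    finally:
        driver.quit()
```

The returned string is the HTML after the page's JavaScript has modified it, so it can be handed to whatever HTML parser you already use. Pages that load content asynchronously may need an explicit wait before the DOM is read.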

You can probably use python-webkit for this. It requires a running GLib and GTK, but that's probably less problematic than wrapping the parts of WebKit without GLib.

I don't know if it does everything you need, but I guess you should give it a try.

Category:javascript Views:0 Time:2008-09-24
