PyHtmlUnit

PyHtmlUnit is a very simple implementation of HTML unit testing for Python. It is similar to the HtmlUnit Java library and was inspired by the API Uncle Bob would have liked to find for HTML unit testing. The motivation for this module was to support unit testing of CherryPy applications with the simplest framework possible. (For a more complete framework, consider BeautifulSoup.)

Implementation

It is provided as a Python module packaged with distutils. To install PyHtmlUnit on your system, use the setup.py script to build it (as a regular user) and then to install it (as root / admin). PyHtmlUnit is then available system-wide.


PyHtmlUnit-x.y.z $ python setup.py build 
PyHtmlUnit-x.y.z $ sudo python setup.py install

The test folder of the distributed package contains a usage example, and the doc folder contains a discussion of using PyHtmlUnit to test CherryPy web applications.

HTML documents are represented as HtmlElement trees. Such trees can be queried in order to check the existence of elements and the values of attributes.

PyHtmlUnit is intended to be used in conjunction with unittest, or any other testing framework, for writing assertions.


>>> import htmlunit
>>> page = '''
... <html>
... <body>
... <div id="main">
... <h2>Create New Person</h2>
... <form id="person_form" method="post" action="/person/create">
... <table>
... <tr>
... <td>Firstname:</td> <td><input type="text" name="firstname" /></td>
... </tr>
... <tr>
... <td>Lastname:</td> <td><input type="text" name="lastname" /></td>
... </tr>
... </table>
... <input type="submit" name="ok" value="Create" />
... </form>
... </div>
... <div id="sidebar">
... <a href="/index">Home</a>
... </div>
... </body>
... </html>
... '''
>>> html = htmlunit.parse(page)
>>> form = html.get_html_element_by_id('person_form')
>>> form.get_attribute('action')
'/person/create'
>>> form.content[1].tag
'input'
>>> form.content[1].get_attribute('name')
'ok'
>>> inputs = form.get_html_elements_by_tag('input')
>>> len(inputs)
3
>>> inputs[2].get_attribute('type')
'submit'
>>> page1 = '''
... <html>
... <body>
... <p>A text.</p>
... </body>
... </html>
... '''
>>> page2 = '<html><body><p>A text.</p></body></html>'
>>> htmlunit.parse(page1) == htmlunit.parse(page2)
True
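
Since PyHtmlUnit is meant to be paired with unittest, the queries from the session above can be wrapped in a test case. The following is only a sketch: it uses just the calls shown above (parse, get_html_element_by_id, get_html_elements_by_tag, get_attribute), while the PAGE constant, the class name and the test names are made up for illustration. It assumes that PyHtmlUnit is importable under your interpreter and that get_html_elements_by_tag returns elements in document order, as the session above suggests.

```python
import unittest

try:
    import htmlunit  # PyHtmlUnit, assumed installed with setup.py
except ImportError:
    htmlunit = None  # lets this file load so the tests are skipped, not broken

# Hypothetical page; in a real test this would be the output of your handler.
PAGE = '''
<html><body>
<form id="person_form" method="post" action="/person/create">
<input type="text" name="firstname" />
<input type="submit" name="ok" value="Create" />
</form>
</body></html>
'''


@unittest.skipIf(htmlunit is None, "PyHtmlUnit is not installed")
class PersonFormTest(unittest.TestCase):
    """Illustrative test case; names are not part of PyHtmlUnit."""

    def setUp(self):
        # Parse once per test and keep the form element at hand.
        html = htmlunit.parse(PAGE)
        self.form = html.get_html_element_by_id('person_form')

    def test_form_posts_to_create(self):
        self.assertEqual(self.form.get_attribute('action'), '/person/create')

    def test_form_ends_with_submit_button(self):
        inputs = self.form.get_html_elements_by_tag('input')
        self.assertEqual(len(inputs), 2)
        # Assumes document order, so the submit button comes last.
        self.assertEqual(inputs[1].get_attribute('type'), 'submit')
```

Saved as a module, the tests can be run with python -m unittest; they are reported as skipped when PyHtmlUnit cannot be imported.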

The module also provides a function to test that an HTML page follows a given structure. The page is expected to contain all the elements defined in the template page, and it may also contain additional child elements. In the following example, page1 conforms to the template tmpl, as it has both divs, the expected headings, and the paragraph. On the other hand, page2 does not conform to the template, as it does not contain the h2 heading in the main div.


>>> import htmlunit
>>> tmpl = '''
... <html><body>
... <div id="header">
... <h1></h1>
... </div>
... <div id="main">
... <h2></h2>
... <p></p>
... </div>
... </body></html>
... '''
>>> page1 = '''
... <html><body>
... <div id="header">
... <h1>My Blog</h1>
... </div>
... <div id="main">
... <h2>Entry for today</h2>
... <p>The sun is shining.</p>
... </div>
... </body></html>
... '''
>>> page2 = '''
... <html><body>
... <div id="header">
... <h1>My Blog</h1>
... </div>
... <div id="main">
... <p>The sky is blue.</p>
... </div>
... </body></html>
... '''
>>> tnode = htmlunit.parse(tmpl)
>>> pnode1 = htmlunit.parse(page1)
>>> pnode2 = htmlunit.parse(page2)
>>> htmlunit.page_conforms_to_template(pnode1, tnode)
True
>>> htmlunit.page_conforms_to_template(pnode2, tnode)
False
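
To make the idea behind the conformance check more concrete, here is a minimal sketch of how such a check could work. It is not PyHtmlUnit's implementation: it builds its own tiny tree with the standard library's html.parser, and Node, TreeBuilder, parse and conforms are hypothetical names. It also assumes rules the description above does not spell out: text is ignored, template attributes must be present on the page element with the same value, and template children must appear in the same order among the page children.

```python
from html.parser import HTMLParser


class Node:
    """Minimal element node: tag name, attributes, child elements."""

    def __init__(self, tag, attrs=()):
        self.tag = tag
        self.attrs = dict(attrs)
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a Node tree from properly nested markup; text is ignored."""

    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in self.VOID:
            self.stack.append(node)  # void elements never get children

    def handle_endtag(self, tag):
        # Assumes proper nesting, so the top of the stack is the open element.
        if tag not in self.VOID and len(self.stack) > 1:
            self.stack.pop()


def parse(markup):
    """Parse markup and return the synthetic root node."""
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


def conforms(page, tmpl):
    """True if page matches tmpl's tag and attributes, and every template
    child is matched, in order, by some page child (extras are allowed)."""
    if page.tag != tmpl.tag:
        return False
    if not all(item in page.attrs.items() for item in tmpl.attrs.items()):
        return False
    # A shared iterator makes each template child search only the page
    # children that come after the previous match.
    candidates = iter(page.children)
    return all(any(conforms(c, t) for c in candidates) for t in tmpl.children)


tmpl = parse('<html><body><div id="header"><h1></h1></div>'
             '<div id="main"><h2></h2><p></p></div></body></html>')
page1 = parse('<html><body><div id="header"><h1>My Blog</h1></div>'
              '<div id="main"><h2>Entry</h2><p>Sunny.</p></div></body></html>')
page2 = parse('<html><body><div id="header"><h1>My Blog</h1></div>'
              '<div id="main"><p>Blue sky.</p></div></body></html>')

print(conforms(page1, tmpl))
print(conforms(page2, tmpl))
```

With these inputs the sketch agrees with the session above: page1 is accepted, and page2 is rejected because its main div has no h2. Whether PyHtmlUnit itself enforces child order or compares attribute values is not stated here, so treat those rules as choices of the sketch.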

Download

PyHtmlUnit 0.3.0 source archive

PyHtmlUnit 0.3.0 Egg