Friday, May 4, 2012

Scraping Web Pages With jQuery, Node.js and Jsdom

I always found it odd that accessing DOM elements with Ruby or Python wasn’t as easy as it is with jQuery. Many HTML parsing libraries use the Simple API for XML (SAX), which can handle extremely large XML documents but is cumbersome and adds complexity. Other parsing libraries use XML Path Language (XPath), which is conceptually simpler than SAX but still takes more effort than jQuery. I was pleasantly surprised to discover that it’s possible to use jQuery to parse web pages with Node.js. This is accomplished by using jsdom, “a javascript implementation of the W3C DOM”.

Rest of the article (original source)
