Atom feed for the Habrahabr sandbox

As you know, one of the most convenient ways to follow regularly updated content is an RSS feed. A few days ago Habr user rozboris lamented that he reads habr.ru/new through his favorite RSS aggregator, but to see what has appeared in the Habrahabr sandbox he has to open a browser. The idea of creating a feed for the sandbox seemed attractive to us.

So I wrote a script; by subscribing to it you can read fresh posts from the sandbox.

What to write it in?


There are many options: Python, Perl, PHP. I love writing in Python, but I chose PHP because, if the feed becomes popular, it will be easier for interested Habr users to deploy the script on their own servers (in my opinion PHP is more widely available).

How it works


It's simple. When an aggregator requests the script, the script refreshes the latest posts from the first page of the sandbox in either of two cases: no posts are stored yet (perhaps the script is running for the first time), or more than 20 minutes have passed since the last update.
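The refresh rule above can be sketched in a few lines. This is only an illustration in Python, not the author's PHP code; the function name `needs_refresh` and its parameters are hypothetical, and only the 20-minute threshold comes from the text.

```python
import time

# The 20-minute threshold mentioned in the text.
REFRESH_INTERVAL_SECONDS = 20 * 60


def needs_refresh(stored_post_count, last_update_ts, now=None):
    """Return True when the first page of the sandbox should be fetched again.

    stored_post_count: number of posts currently in the database.
    last_update_ts: Unix time of the previous refresh.
    now: current Unix time (defaults to time.time(), overridable for testing).
    """
    if now is None:
        now = time.time()
    # Case 1: nothing stored yet, e.g. the very first run.
    if stored_post_count == 0:
        return True
    # Case 2: the stored posts are older than the threshold.
    return now - last_update_ts > REFRESH_INTERVAL_SECONDS
```

Checking the age on each request means no cron job is needed: the aggregators' own polling drives the updates.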

The update works as follows: the script parses the topics, extracting from each its title, link, post body, and publication time. The publication date and time are converted to an Atom timestamp (yes, the feed uses the Atom format), and if the database has no topic with the same timestamp, the topic is added.
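A rough sketch of the timestamp conversion and the "add only if absent" check might look like this. It is written in Python with an in-memory SQLite database standing in for MySQL; the table layout, the function names, and the +03:00 offset in the example are assumptions made for illustration, not details of the original script.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def to_atom_timestamp(published):
    """Format a timezone-aware datetime as an RFC 3339 string, as Atom expects."""
    if published.tzinfo is None:
        raise ValueError("Atom timestamps need an explicit UTC offset")
    return published.isoformat(timespec="seconds")


def open_store():
    """Create a throwaway store; the real script keeps one table in MySQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE topics ("
        "published TEXT PRIMARY KEY, title TEXT, link TEXT, body TEXT)"
    )
    return conn


def add_if_new(conn, topic):
    """Insert the topic unless one with the same timestamp is already stored."""
    row = conn.execute(
        "SELECT 1 FROM topics WHERE published = ?", (topic["published"],)
    ).fetchone()
    if row is not None:
        return False
    conn.execute(
        "INSERT INTO topics (published, title, link, body) VALUES (?, ?, ?, ?)",
        (topic["published"], topic["title"], topic["link"], topic["body"]),
    )
    conn.commit()
    return True


# Illustrative values only: the offset and the topic fields are made up.
example_zone = timezone(timedelta(hours=3))
example_stamp = to_atom_timestamp(datetime(2010, 5, 1, 12, 30, tzinfo=example_zone))
```

Using the timestamp as the only key is what the text describes; two topics published in the same second would collide, so a stricter variant could key on the link instead.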

After that, the stored topics are used to build the XML, which is returned to the user.
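As an illustration of this last step, here is a minimal sketch that assembles an Atom document with Python's standard `xml.etree.ElementTree`. The function name `build_feed`, the topic dictionary keys, and the choice to reuse each topic's link as its entry id are assumptions; the original script may build the XML differently.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def _tag(name):
    """Qualify an element name with the Atom namespace."""
    return "{%s}%s" % (ATOM_NS, name)


def build_feed(feed_id, feed_title, author_name, updated, topics):
    """Return an Atom feed as a string.

    topics: list of dicts with "title", "link", "published" (RFC 3339 string)
    and "body" (HTML) keys; these key names are hypothetical.
    """
    # Serialize Atom elements without a prefix (xmlns="...").
    ET.register_namespace("", ATOM_NS)

    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("id")).text = feed_id
    ET.SubElement(feed, _tag("title")).text = feed_title
    ET.SubElement(feed, _tag("updated")).text = updated
    author = ET.SubElement(feed, _tag("author"))
    ET.SubElement(author, _tag("name")).text = author_name

    for topic in topics:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = topic["link"]
        ET.SubElement(entry, _tag("title")).text = topic["title"]
        ET.SubElement(entry, _tag("updated")).text = topic["published"]
        ET.SubElement(entry, _tag("link"), rel="alternate", href=topic["link"])
        # type="html" tells readers the text is escaped HTML markup.
        ET.SubElement(entry, _tag("content"), type="html").text = topic["body"]

    return ET.tostring(feed, encoding="unicode")
```

The `id`, `title`, and `updated` elements are required by the Atom specification for both the feed and each entry, which is why the sketch always emits them.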

Trivia


To run the script, you need to put your MySQL access credentials into the code (you will need only one table, whose name can also be configured inside the script). I decided to use a database rather than a file lying next to the script in order to avoid possible problems with simultaneous access to that file: for example, one instance of the script updating the posts while another is trying to read them.

In addition, according to Habr's rules, any bot that reads Habr content must follow certain rules; in particular, it must send a proper User-Agent that contains information about the bot's owners. So I hastily put together a page with a short description of the project, our contacts, and a link to the source code.
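For illustration, attaching an identifying User-Agent to a request could look like the sketch below, using Python's standard `urllib.request`. The bot name, the info page URL, the contact address, and the exact User-Agent layout are all placeholders; the text does not specify the format Habr expects.

```python
from urllib.request import Request


def build_request(url, bot_name, info_url, contact):
    """Prepare a request whose User-Agent identifies the bot and its owner."""
    # Hypothetical layout: "Name/version (+info page; contact)".
    user_agent = "%s (+%s; %s)" % (bot_name, info_url, contact)
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder values; replace them with your own project page and contact.
request = build_request(
    "https://example.org/sandbox/",
    "SandboxFeedBot/0.1",
    "https://example.org/bot-info",
    "owner@example.org",
)
```

Passing the prepared request to `urllib.request.urlopen` would then send it with that header.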

If you do decide to run the script, please fill in $user with your contact information so that Habr can reach you in case of questions.

PS


The script works, but it is still rough. I hope my work will be useful to you. License: MIT License.

Oh yes, and here is the link to the feed!

UPDATE: In my free time I will correct some inaccuracies so that the feed meets the specification. Thanks for the comments!
Article based on information from habrahabr.ru
