The largest database in the world belongs to Yahoo!, and it runs on PostgreSQL.

Yahoo! claims it has broken a world record by building the largest and most heavily loaded database in the world.

The database, launched a year ago, has reached 2 petabytes. The system is designed for analytical purposes: it holds the history of web users' behavior (reportedly, data on half a billion users is stored each month). The Internet giant also declares that this is not only the largest database in the world but also the most heavily loaded: it records information about 24 billion events per day.
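To put the daily figure in perspective, it can be converted into an average per-second rate. This is only a back-of-the-envelope sketch: it assumes the load is spread evenly over the day, which the source does not state.

```python
# Average event rate implied by the figure Yahoo! quotes.
# Assumption: events arrive evenly across the day (real traffic will peak higher).
EVENTS_PER_DAY = 24_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60

events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
print(f"{events_per_second:,.0f} events per second on average")
```

So the quoted load works out to roughly 278 thousand events per second on average.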
Postgres!
And now the most interesting part: this monster is run by a modified PostgreSQL. That is the result of Yahoo's purchase of the startup Mahat Technologies, which had worked from the start with PostgreSQL, the most advanced open source database system. The "Postgres" code was modified to handle such huge volumes of information. One of the major changes is a switch from the traditional row-based storage to columnar storage, which slows down disk writes but gives the fastest access to data for analytical purposes. The positive result is obvious: some tables in the database contain trillions of rows that do not simply lie dormant on disk, but can be queried and processed with standard SQL in a standard ACID-compliant environment.
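The trade-off between row-based and columnar storage can be illustrated with a small in-memory sketch. It is not Yahoo's implementation, and the field names (`user_id`, `url`, `ms`) and helper functions are hypothetical; it only shows why an analytical aggregate reads less data in a columnar layout, while writing a record touches more places.

```python
# Hypothetical event records; the field names are illustrative only.
rows = [
    {"user_id": 1, "url": "/home", "ms": 120},
    {"user_id": 2, "url": "/mail", "ms": 340},
    {"user_id": 1, "url": "/news", "ms": 95},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def append_row(columns, row):
    """Append one record: every column has to be touched (the write-side cost)."""
    for name, value in row.items():
        columns[name].append(value)


columns = to_columns(rows)

# Row layout: the aggregate has to walk through every whole record.
total_row_layout = sum(row["ms"] for row in rows)

# Columnar layout: the aggregate reads just the one column it needs.
total_column_layout = sum(columns["ms"])

print(total_row_layout, total_column_layout)
```

Both totals are the same; the difference is in how much data has to be read to compute them. In a real on-disk system the columnar variant can skip the `user_id` and `url` data entirely, which is the gain for analytics described above.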

Yahoo engineers expect the database to grow to 5 petabytes by next year, and they are ready for such growth. For comparison, enterprise-level databases rarely exceed tens of terabytes. For example, one of the largest publicly known databases in the world, that of the IRS, weighs in at only 150 terabytes. eBay says it works with systems that can process 10 billion rows a day; the total amount of data in these systems is 6 petabytes, and the largest of them holds approximately 1.4 petabytes.

Note that we are talking about DBMSs and the databases built on them. There are data stores with even more impressive volumes, but the data in them is almost inaccessible for analysis and processing. For example, the World Data Center for Climate in Hamburg holds more than 6 petabytes of data on magnetic tape, while "only" 220 terabytes are in the "active" state (managed by a DBMS running on Linux, see PDF).

"PostgreSQL continues to grow, confirming its title as the most advanced of the open DBMSs," says Nikolay Samokhvalov, a representative of the company "Postgresmain". "Last year, engineers at Sun showed the world that PostgreSQL is not inferior to Oracle in performance. At the recent international conference PGCon 2008 in Canada, NASA representatives spoke about their experience of using PostgreSQL to work with large databases in the field of climate observations. Yahoo's experience is another striking confirmation of the maturity of PostgreSQL. This is very good news for all of us. It is a pity that, as far as I know, Yahoo has no plans to share its work with the community."
Article based on information from habrahabr.ru
