Monday, June 18, 2007

Wikipedia's MySQL databases handle over 25,000

I got this by email from Vibek dai. We had a short chat yesterday about open source and licensed software. It is a useful article for everyone.


Wikipedia is the multilingual, Web-based, free encyclopedia that is
produced collaboratively by volunteers. According to Alexa Traffic
Rankings, Wikipedia consistently ranks in the Top Ten most-visited Web
sites in the world. It hosts over 5 million articles in more than 100
languages. Every day, tens of millions of visitors learn more about
their world - making nearly a half-million edits and creating
thousands of new entries.

Their 'Growing' Challenge:

Wikipedia's growth statistics are simply amazing. The organization has
faced exponential growth on many fronts since its introduction in
2001. These include:

* Annual visitor growth from fewer than 50,000 to over 154 million
* Content growth from fewer than 100 articles to over 5 million
* Contributor growth from fewer than 100 to over 290,000

Wikipedia expects its content, contributor base and user base to keep
growing, and needs a computing infrastructure that can keep pace.

Their Scale-Out Solution:

This phenomenal growth has put constant technical pressure on the
performance and scalability of the system. Wikipedia is based on the
LAMP stack (Linux, Apache, MySQL and PHP). It started on a single
shared server and is now a Top Ten site, with more than 20 replicated
database servers delivering up-to-date content to visitors.
Additionally, lightweight MySQL instances are spread across the
application servers as a distributed archive solution.

Wikipedia relies upon MySQL replication to scale out their database
infrastructure and accommodate more visitors, more articles and more
contributors. This architecture also allows them to save significantly
on hardware costs. Since they add new servers only on an incremental,
as-needed basis, they can delay their new hardware purchases until
more powerful machines drop to lower, commodity prices.
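
To make the replication idea concrete, here is a minimal sketch of how an application might send writes to one server and spread reads over the replicated copies. It is an illustration only, not Wikipedia's actual code: the class name ReplicatedPool, the server names and the list of write verbs are all assumptions made for the example, and it only decides which server to use rather than opening real MySQL connections.

```python
import itertools

# Assumption for this sketch: statements starting with these verbs change data.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}


class ReplicatedPool:
    """Hypothetical router: writes go to one primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over the replicas; fall back to the primary if there are none.
        self._cycle = itertools.cycle(self.replicas or [primary])

    def add_replica(self, name):
        """Scale out: add one more read server on an as-needed basis."""
        self.replicas.append(name)
        # Restart the rotation so the new server is included.
        self._cycle = itertools.cycle(self.replicas)

    def route(self, sql):
        """Return the name of the server that should run this statement."""
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        if verb in WRITE_VERBS:
            return self.primary
        return next(self._cycle)


if __name__ == "__main__":
    pool = ReplicatedPool("db-primary", ["db-replica-1", "db-replica-2"])
    for statement in (
        "SELECT title FROM page WHERE id = 1",
        "UPDATE page SET touched = NOW() WHERE id = 1",
        "SELECT title FROM page WHERE id = 2",
    ):
        print(pool.route(statement), "<-", statement)
```

Because MySQL replication is normally asynchronous, a read sent to a replica can briefly lag behind the latest write, so a real deployment would likely need extra rules for reads that must see fresh data.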

What is Database Scale-Out?

Scale-Out is a modern computing architecture that enables
organizations to improve application performance and scalability on an
incremental, as-needed basis by adding multiple replicated database
servers on low-cost commodity hardware. This is in contrast to a
Scale-Up approach, which requires organizations to make a large up-front
investment in more expensive and complex server hardware and database
licenses in order to add capacity.
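
The "incremental, as-needed" part can be sketched as simple arithmetic: estimate how many replicated servers a given read load needs, and buy more only when the load grows. The function name replicas_needed, the per-server capacity and the load figures below are hypothetical values chosen for illustration, not measurements from Wikipedia or MySQL.

```python
import math


def replicas_needed(peak_reads_per_sec, reads_per_server, headroom=0.25):
    """Smallest number of read servers that covers the peak load
    while keeping a fraction `headroom` of each server spare."""
    if reads_per_server <= 0 or not 0 <= headroom < 1:
        raise ValueError("need reads_per_server > 0 and 0 <= headroom < 1")
    usable = reads_per_server * (1.0 - headroom)
    return math.ceil(peak_reads_per_sec / usable)


if __name__ == "__main__":
    # Hypothetical growth path: the server count follows the load step by step.
    for peak in (1_000, 5_000, 20_000):
        count = replicas_needed(peak, reads_per_server=2_000)
        print(f"{peak} reads/sec -> {count} servers")
```

With Scale-Up, the same growth would mean sizing one large machine for the final load up front; here each step only adds the servers the current load requires.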

In the online world, many of the largest and fastest-growing companies,
including Google, Yahoo, craigslist, Ticketmaster, Wikipedia, YouTube,
and Evite/Citysearch, use MySQL to scale out their businesses
cost-effectively, saving millions of dollars over high-cost proprietary
technology.

"The notion persists within many traditional enterprises that once you
reach a certain level of application importance, it is necessary to
transition to big, expensive boxes running big, expensive databases.
However, free-thinking members of their IT staffs are beginning to ask
the question: 'What can we learn from Google, Yahoo, and Wikipedia on
how to scale for high growth?'"
Stephen O'Grady, Principal Analyst
RedMonk

