
Facebook trapped in MySQL ‘fate worse than death’


10 replies to this topic

#1 EmFeld

EmFeld

    part human part machine

  • Moderator
  • 5,178 posts

Posted 08 July 2011 - 11:45 PM

Source: http://gigaom.com/cl...rse-than-death/
Date: Jul. 7, 2011, 1:00pm PT
By: Derrick Harris

Facebook trapped in MySQL ‘fate worse than death’



According to database pioneer Michael Stonebraker, Facebook is operating a huge, complex MySQL implementation equivalent to “a fate worse than death,” and the only way out is “bite the bullet and rewrite everything.”

Not that it’s necessarily Facebook’s fault, though. Stonebraker says the social network’s predicament is all too common among web startups that start small and grow to epic proportions.

During an interview this week, Stonebraker explained to me that Facebook has split its MySQL database into 4,000 shards in order to handle the site’s massive data volume, and is running 9,000 instances of memcached in order to keep up with the number of transactions the database must serve. I’m checking with Facebook to verify the accuracy of those numbers, but Facebook’s history with MySQL is no mystery.
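A rough illustration of what "shards plus memcached" means in practice. This is only a sketch under stated assumptions, not Facebook's actual code: the hash-based routing scheme, the `ShardedStore` class and the use of plain dicts as stand-ins for MySQL shards and the cache are all hypothetical. Only the figure of 4,000 shards comes from the article.

```python
import hashlib

NUM_SHARDS = 4000  # shard count quoted in the article; everything else is assumed


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index with a stable hash (hypothetical scheme)."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Dicts stand in for the MySQL shards and for the memcached layer."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}
        self.db_reads = 0  # counts reads that actually hit a shard

    def write(self, user_id, profile):
        # Write to the owning shard, then drop any stale cached copy.
        self.shards[shard_for(user_id, self.num_shards)][user_id] = profile
        self.cache.pop(user_id, None)

    def read(self, user_id):
        # Cache-aside read: try the cache first, fall back to the shard.
        if user_id in self.cache:
            return self.cache[user_id]
        self.db_reads += 1
        value = self.shards[shard_for(user_id, self.num_shards)].get(user_id)
        if value is not None:
            self.cache[user_id] = value
        return value
```

Repeated reads of the same user are served from the cache, so the shard is touched only on the first read after a write. That is the load-shedding effect the cache layer is there for.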

The oft-quoted statistic from 2008 is that the site had 1,800 servers dedicated to MySQL and 805 servers dedicated to memcached, although multiple MySQL shards and memcached instances can run on a single server. Facebook even maintains a MySQL at Facebook page dedicated to updating readers on the progress of its extensive work to make the database scale along with the site.

The widely accepted problem with MySQL is that it wasn’t built for webscale applications or those that must handle excessive transaction volumes. Stonebraker said the problem with MySQL and other SQL databases is that they consume too many resources for overhead tasks (e.g., maintaining ACID compliance and handling multithreading) and relatively few on actually finding and serving data. This might be fine for a small application with a small data set, but it quickly becomes too much to handle as data and transaction volumes grow.

This is a problem for a company like Facebook because it has so much user data, and because every user clicking “Like,” updating his status, joining a new group or otherwise interacting with the site constitutes a transaction its MySQL database has to process. Every second a user has to wait while a Facebook service calls the database is time that user might spend wondering if it’s worth the wait.

Not just a Facebook problem

In Stonebraker’s opinion, “old SQL (as he calls it) is good for nothing” and needs to be “sent to the home for retired software.” After all, he explained, SQL was created decades ago before the web, mobile devices and sensors forever changed how and how often databases are accessed.

But products such as MySQL are also open-source and free, and SQL skills aren’t hard to come by. This means, Stonebraker says, that when web startups decide they need to build a product in a hurry, MySQL is a natural choice. But then they hit that hockey-stick-like growth rate like Facebook did, and they don’t really have the time to re-engineer the service from the database up. Instead, he said, they end up applying Band-Aid fixes that solve problems as they occur, but that never really fix the underlying problem of an inadequate data-management strategy.

There have been various attempts to overcome SQL’s performance and scalability problems, including the buzzworthy NoSQL movement that burst onto the scene a couple of years ago. However, it was quickly discovered that while NoSQL might be faster and scale better, it did so at the expense of ACID consistency. As I explained in a post earlier this year about Citrusleaf, a NoSQL provider claiming to maintain ACID properties:

ACID is an acronym for “Atomicity, Consistency, Isolation, Durability” — a relatively complicated way of saying transactions are performed reliably and accurately, which can be very important in situations like e-commerce, where every transaction relies on the accuracy of the data set.
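To make the "A" in ACID concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the names and the `transfer` helper are made up for illustration. SQLite stands in for any ACID database here and has nothing to do with Facebook's setup.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return all balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

A failed transfer leaves both balances untouched, because the debit that was already applied is rolled back together with everything else in the transaction.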


Stonebraker thinks sacrificing ACID is a “terrible idea,” and, he noted, NoSQL databases end up only being marginally faster because they require writing certain consistency and other functions into the application’s business logic.

Stonebraker added, though, that NoSQL is a fine option for storing and serving unstructured or semi-structured data such as documents, which aren’t really suitable for relational databases. Facebook, for example, created Cassandra for certain tasks and also uses the Hadoop-based HBase heavily, but it’s still a MySQL shop for much of its core needs.

Is ‘NewSQL’ the cure?

But Stonebraker, an entrepreneur as much as a computer scientist, has an answer for the shortcomings of both “old SQL” and NoSQL. It’s called NewSQL (a term coined by 451 Group analyst Matthew Aslett) or scalable SQL, as I’ve referred to it in the past. Pushed by companies such as Xeround, Clustrix, NimbusDB, GenieDB and Stonebraker’s own VoltDB, NewSQL products maintain ACID properties while eliminating most of the other functions that slow legacy SQL performance. VoltDB, an online-transaction processing (OLTP) database, utilizes a number of methods to improve speed, including running entirely in-memory instead of on disk.

It would be easy to accuse Stonebraker of tooting his own horn, but NewSQL vendors have been garnering lots of attention, investment and customers over the past year. There’s no guarantee they’re the solution for Facebook’s MySQL woes — the complexity of Facebook’s architecture and the company’s penchant for open source being among the reasons — but perhaps NewSQL will help the next generation of web startups avoid falling into the pitfalls of their predecessors. Until, that is, it, too, becomes a relic of the Web 3.0 era.


A common mistake startups make...that's why I'm using a scalable engine, just in case :Peace:

regards

-EmFeld-

Edited by EmFeld, 08 July 2011 - 11:47 PM.
typo

  • 0

#2 dcL

dcL

    ‡‡‡ Morning Star ‡‡‡

  • Moderator
  • 3,937 posts

Posted 08 July 2011 - 11:52 PM

wow, too much technical lingo for me :HeHe: I got lost after reading the 3rd paragraph :HeHe:
  • 0

#3 Azeir Lonewolf

Azeir Lonewolf

    R.I.P

  • Members
  • PipPipPipPipPipPipPipPip
  • 7,287 posts

Posted 08 July 2011 - 11:57 PM

:doh: i don't understand that article.. :p sorry, my english is bad

Edited by Azeir Lonewolf, 09 July 2011 - 12:00 AM.

  • 0

#4 TheCriminalCat

TheCriminalCat

    ~Kucing Nakal~

  • Moderator
  • 6,049 posts

Posted 08 July 2011 - 11:58 PM

wow, too much technical lingo for me :HeHe: I got lost after reading the 3rd paragraph :HeHe:


agree... :HeHe:

does it mean.. that FB actually had an error since the beginning?

azeir.. pls edit your comment bro :HeHe:

Edited by TheCriminalCat, 09 July 2011 - 12:00 AM.

  • 0

#5 EmFeld

EmFeld

    part human part machine

  • Moderator
  • 5,178 posts

Posted 08 July 2011 - 11:59 PM

does it mean.. that FB actually had an error since the beginning?


it means it's been a time bomb since the beginning, cuz it uses MySQL, a database meant for not-so-complex web applications, and FB has grown into a very complex, very large-scale application now


regards

-EmFeld-

Edited by EmFeld, 09 July 2011 - 12:05 AM.

  • 0

#6 H4cK

H4cK

    BlueFame Flooder

  • Members
  • PipPipPipPipPipPip
  • 1,551 posts

Posted 09 July 2011 - 12:29 AM

Nice article... :)
  • 0

#7 .OYON.

.OYON.

    shadowless sword

  • Moderator
  • 16,128 posts

Posted 09 July 2011 - 03:56 AM

if i'm zuckerberg ... i would say ... "F*CK Meeeeeeee" :axehead:

it seems there's no way out of this matter
when the basic foundation has failed ... then the rest is "database annihilation"

perhaps at the beginning he didn't expect it to go this far ... while FB keeps consuming more and more space
MySQL has now become a real problem ... and he has to rack his brain to solve this circle of database death :Hypnotized:


perhaps ... he must start writing a new concept here ...
Adding lots of servers and extending the bandwidth is just a temporary solution
a shadow database server must be set up to begin the migration from MySQL to another proven database that can handle a supermassive data volume.
while he gradually moves the real database into the new database system ... the shadow database server will handle the daily routine FB transactions
Once all the migration has succeeded, then he could switch to the new database system
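
One way to picture the shadow-server idea is the common dual-write pattern: keep serving from the old database, write every change to both, copy the old rows over in the background, then switch. The sketch below is only an illustration under that assumption. `MigratingStore` and its method names are hypothetical, and dicts stand in for the old and new database systems.

```python
class MigratingStore:
    """Serve traffic from the old store while a new store is filled up."""

    def __init__(self, old, new):
        self.old = old          # stand-in for the existing database
        self.new = new          # stand-in for the target database
        self.cut_over = False   # becomes True once reads move to the new store

    def write(self, key, value):
        # Before cutover every write goes to both stores so they stay in sync.
        if not self.cut_over:
            self.old[key] = value
        self.new[key] = value

    def read(self, key):
        source = self.new if self.cut_over else self.old
        return source.get(key)

    def backfill(self):
        """Copy rows that exist only in the old store; return how many."""
        copied = 0
        for key, value in list(self.old.items()):
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied

    def switch(self):
        """Move reads to the new store, but only if it matches the old one."""
        if self.new != self.old:
            raise RuntimeError("stores differ; run backfill() first")
        self.cut_over = True
```

The check in `switch()` is the important part: reads only move once the new store holds everything the old one does.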

well, it's not easy to carry out a real database migration with the current FB database
but at least ... rather than killing himself ... it's better to plan and act ASAP.

i have no idea which kind of database system would fit a business model like FB's
but i think it has to be a UNIX database system ... a light one, fast, and low on bandwidth consumption

maybe @Emfeld has an idea what kind of database would fit there.


Regards,
OyOn
  • 0

#8 EmFeld

EmFeld

    part human part machine

  • Moderator
  • 5,178 posts

Posted 09 July 2011 - 08:09 AM

^^
Use Google Datastore, which is built on Bigtable :Peace:

regards

-EmFeld-
  • 0

#9 Elite_Cadre

Elite_Cadre

    BlueFame Typer

  • Members
  • PipPipPipPipPip
  • 1,356 posts

Posted 09 July 2011 - 08:46 AM

What? They still use MySQL? What about Oracle?
  • 0

#10 .OYON.

.OYON.

    shadowless sword

  • Moderator
  • 16,128 posts

Posted 09 July 2011 - 11:53 AM

^^

awww Oracle ....
all IT people call Oracle a Big Problem

Why ? ...
coz even though it can handle a large-scale database, in the end it will create the same problem as MySQL

based on my own experience handling some of my clients that currently use Oracle as their database,
it's a real big headache to handle. Not to mention, it needs huge server resources and bandwidth.


@Emfeld ...
I have no idea either about google ... with their Google Datastore
Since they currently use it for search purposes only,
but ever since Google+ went into testing ... i've felt the speed decrease a little while searching for several topics on my pc.

Before, it was amazingly fast when searching for something.
  • 0

#11 EmFeld

EmFeld

    part human part machine

  • Moderator
  • 5,178 posts

Posted 09 July 2011 - 12:40 PM

^^
Google Datastore is a database engine based on Bigtable. It currently runs on Google App Engine, a platform most FB game developers (including me) use. I've used it for most of the web apps I wrote lately, and the performance is wonderful. It can manage more than 100,000 concurrent requests without problems. Google is slowing down since they are trying to integrate G+ with real-time search. Soon to be resolved :Peace:

Oracle is not a solution, it's a problem for web applications, especially at Facebook's scale. Oracle will only run well in a standardized environment, not in a complex environment like the web.

regards

-EmFeld-

Edited by EmFeld, 09 July 2011 - 12:45 PM.

  • 0



