Monday, August 7. 2006MySQL Connection Management in PHP - How (Not) To Do ThingsI'll warn you right now, this is going to be a long article. More than likely, I'll put a version of this up on the MySQL developer zone and PHP zone. This article is intended to highlight various basic topics concerning proper methods of handling connections to MySQL databases in PHP, guidelines for caching dynamic content, and a technique called "lazy loading". Hopefully by the end of the article you'll have learned how to combat a very widespread and potentially devastating scalability problem seen in an enormous number of PHP web applications. An introduction to the problemBefore I start the discussion on connecting to MySQL servers via PHP, it's worth pointing out that the relative cost of connecting to a MySQL database, versus connecting to a PostgreSQL or Oracle installation, is very, very low. However, the fact that connecting to a MySQL resource is inexpensive does not mean that connection resources can be abused. Anyone who has ever seen the dreaded "Too many connections" error which occurs when MySQL attempts to service more connections than the number of concurrent connections afforded by the max_connections configuration variable, knows what I am talking about. Connecting to a MySQL resource, while inexpensive, requires PHP to issue a call to the mysql client API via mysql_connect(). Passed as variables to this function are the connection arguments for the database host, user, password, etc. Once the MySQL database server has received the connection information, it will either return a success value, or a failure. Upon failure, the mysql_error() function can be used to determine what went wrong during the connection attempt (typically user credential problems or the max connections issue). So, where, you ask, is the problem? Well, the issue that I commonly see is that connections are made to MySQL resources when they do not need to be made. But, you say, almost all web applications serve dynamic content, so therefore doesn't dynamic content virtually require a connection to a database be made? Well, not really in many, many cases
Let's take as an example a very, very popular PHP web application installed on hundreds of thousands of servers worldwide: the blogging application Wordpress. Now, before I go any further, I want to say that the reader should not think that I am attacking Wordpress in this article, or deliberately trying to point out shortcomings in their software. By contrast, I picked Wordpress to demonstrate that the problem described in this article is widespread among PHP web applications. At the very end of the article, I'll present a patch to the current Wordpress source code which fixes the issue identified in this article. I will post the patch to the wp-testers mailing list after I complete the article. Promise. Now, a blogging application is indeed a data-driven web application. Typical blog software involves the posting of articles, the management of comments, the display of such articles and comments, and, of course, pages which serve to provide the various RSS and Atom feeds for the blog. So, one might argue that blogging software is highly dynamic, and therefore would necessarily issue calls to connect to the database upon every visit to the blogsite. This, however, is not particularly true. Even on extremely busy blogs, content doesn't change on a continual basis. This point is even more relevant when you consider that after a certain number of days after an article is posted, content becomes almost entirely static. Keep this point in mind as you follow through the next sections, which walk through the page invocation process which Wordpress executes upon every hit to a blog page (including feed pages). Investigating Wordpress page invocationOn all PHP pages in the Wordpress main directory — which include index.php, wp-atom.php, wp-rss, etc. — each file begins with the following include: require(./wp-blog-header.php);The wp-blog-header.php page does a couple things but mostly serves to include the following: require_once( dirname(__FILE__) . '/wp-config.php');OK, so we're heading over to wp-config.php... and we find some good stuff, such as the defines for the database connection parameters, and:
define('ABSPATH', dirname(__FILE__).'/');
require_once(ABSPATH.'wp-settings.php');
OK, so a quick :find wp-settings.php later, we open up the first meaty file of our page invocation. The first 72 code lines of wp-settings.php do some housekeeping stuff, like checking if the PHP version is adequate and if the MySQL extension is installed (tangent: is it really necessary to do this on every web page invocation?!). After that, we see the following include:
define('WPINC', 'wp-includes');
require_once (ABSPATH . WPINC . '/wp-db.php');
Into the heart of the beastOK, so thus far there's been nothing spectacular or extraordinary about the code. Just a few includes to make file maintenance orderly, but nothing unusual. However, the wp-includes/wp-db.php file contains perhaps the most common PHP/MySQL gotcha seen in today's web applications. The file starts out with some defines and then the class definition of the Wordpress database abstraction object, called wpdb. The wpdb class contains the very typical methods commonly seen in a DB abstraction layer: get_col(), get_row(), get_results(), etc, which allow a query string to be passed in and executed against a MySQL database. However, there is one major problem in the design, which manifests itself in the last line of the file: $wpdb = new wpdb(DB_USER, DB_PASSWORD, DB_NAME, DB_HOST); Well, what's so wrong with that, you say? Well, what does the new operator do? It creates an object of a class type specified and, during the creation of the object, calls the class constructor, which is the class method with a name identical to the class name — in this case, the method called wpdb, shown here:
// ==================================================================
// DB Constructor - connects to the server and selects a database
function wpdb($dbuser, $dbpassword, $dbname, $dbhost) {
$this->dbh = @mysql_connect($dbhost, $dbuser, $dbpassword);
if (!$this->dbh) {
$this->bail("... content snipped ... ");
}
$this->select($dbname);
}
Can anyone tell what is happening on every page invocation to a Wordpress blog? Yup. A connection is being made to the underlying MySQL database. On a heavily hit blog site, this code can easily lead to the dreaded too many connections error because a connection to the database is being made even for mostly static content. There are a couple ways to combat this problem: Lazy loading and Content Caching. These two techniques can be used together to eliminate a huge portion of the database queries and connections in typical web applications. Lazy LoadingThe code in the wpdb class isn't fundamentally wrong. It just needs some tweaking to ensure that a connection to the database is only made if a query is executed against the database. A technique called lazy loading essentially delays the connection to the database until the last minute, instead of upon creation of the database abstraction object. The MySQL Forge database abstraction layer uses lazy loading in just this way. The name of the class is SqlConnection, and it has an empty constructor. Instead of connection logic embedded in the constructor, the object has a Connect() method, which looks like the following. The code has been modified only to remove the logic which automatically handles master/slave replication switching:
/**
* Attempt to connect the resource based on supplied parameters.
*
* @return boolean
* @access public
*
* @param string (optional) Host name (Server name)
* @param string (optional) User Name
* @param string (optional) User Password
* @param string (optional) Database Name
*/
function Connect() {
if (func_num_args() == 4) {
// A different database has been requested other than the
// standard global config settings
$host = func_get_arg(1);
$user = func_get_arg(2);
$pass = func_get_arg(3);
$dbname = func_get_arg(4);
}
else {
$host = SQL_HOST;
$user = SQL_USER;
$pass = SQL_PASS;
$dbname = SQL_DB_NAME;
}
/**
* Short circuit out when already
* connected. To reconnect, pass
* args again
*/
if (is_resource($this->_Cnn) && func_num_args() != 4) {return true;}
if (! $this->_Cnn = mysql_connect($host, $user, $pass)) {
trigger_error(get_class($this) .
"::Connect() Could not connect to server: " .
mysql_error(), E_USER_ERROR);
return false;
}
else {
if (! mysql_select_db($dbname, $this->_Cnn)) {
trigger_error(get_class($this) .
"::Connect() Could not connect to specified database on server: " .
mysql_error(), E_USER_ERROR);
return false;
}
else {
return true;
}
}
}
The _Cnn member variable is a reference to a MySQL database resource that is returned upon a successful call to mysql_connect(). Notice that the Connect() method has logic which ensures that if an existing connection has already been made during page execution, then the Connect() method simply returns true. You may be surprised to find out that, just like Wordpress, the MySQL Forge software creates a database abstraction object upon each call to the MySQL Forge website. The following code is included in all page invocations: /** * Fine to establish a global connection * here, since connect doesn't occur until * SQL execution. */ require_once(DIR_LIB . 'class/SqlConnection.php'); $GLOBALS['Db'] =& new SqlConnection(); The difference is that a connection to the database is not made in the SqlConnection class constructor, so having the object instantiated doesn't consume database resources, unlike the wpdb class. So, if the constructor doesn't call Connect(), then when exactly will mysql_connect() be called? Here, we see the Execute() method of SqlConnection:
/**
* Executes the supplied SQL statement and returns
* the result of the call.
*
* @return bool
* @access public
*
* @param string SQL to execute
*/
function Execute( $Sql ) {
/* Auto-connect to database */
if (! $this->_Cnn) {
$this->Connect();
}
if (!$this->_Res = mysql_query($Sql, $this->_Cnn)) {
trigger_error(get_class($this) .
"::Execute() Could not execute: " .
mysql_error() .
" (SQL: " . $Sql . ")", E_USER_ERROR);
return false;
}
else {
return true;
}
}
In the Execute() method, you can see that if the _Cnn member variable is not set (meaning, a previous connection has not been made to the database), then the SqlConnection connects, otherwise, it simply executes the supplied string against that connection via the mysql_query() function and stores the returned result resource in the _Res member variable for use by other methods in the class. Other methods of SqlConnection simply wrap the Execute() call and provide result sets in various forms. What this means is that on page evocations to MySQL Forge, unless dynamic data is actually needed, no connections to the database are actually created. Which leads us nicely to the other technique for handling semi-dynamic content web requests: Content Caching. Content CachingCaching is perhaps the most fundamental concept discussed in the field of computer sciences when it comes to performance of both hardware and software. A cache, simply defined, is a storage area for data that has been parsed, retrieved, calculated, or otherwise generated in an expensive operation. The cache functions to alleviate the need for various resources to regenerate the cached data upon every request for the data. Caches exist everywhere in both hardware and software. For instance, on a hardware level, modern CPUs usually have at least two levels of hardware caches (usually called the L1 and L2 caches). These CPU-connected fast-access caches exist so that the CPU does not need to call a kernel-level RAM memory page access call, which is a relatively expensive operation since the speed of access to a RAM page is much slower than the access speed to the locally connected Lx caches. When speaking about caches, it's important to recognize that everything is relative to something else. Accessing a hard disk is much more expensive than accessing a page of RAM, which is much more expensive that accessing a line of bytes stored in the L1 cache. Likewise, in application-specific caches (which we'll be talking about next), the relative cost of accessing cached data is lower than retrieving the same information from the MySQL database. So, let's talk a bit about basic content caching for a PHP web application. Although these examples use PHP, the discussion of application caching applies to all languages. Every web scripting language provides similar functionality to implement caching. Application content caching occurs when a standard call to the database is replaced with a call to an application content cache. In these examples, we'll implement a simple file-based cache; other solutions are, of course, available, including using memcached or a static content web server proxy to serve web content. Wordpress actually implements its own caching mechanism, called ObjectCache. You can take a look at the implementation in the wp-includes/cache.php. However, this implementation has a couple design limitations that make it unsuitable for a discussion on general caching. It uses a tight coupling with other Wordpress functions and objects, which makes the caching mechanism unfriendly for general re-use. A Simple File Cache EngineBefore we get into the implementation of the CacheEngine class that MySQL Forge uses, let's first take a look at some code from the /lib/class/ProjectMemberFinder.php class that handles requests to retrieve information about the members involved in a project listed in the MySQL Forge project directory:
/**
* Return project members based on project ID value
*
* @return array
* @param int project ID
*/
function &GetByProjectId($Project) {
/**
* ProjectMembers don't change that often,
* so cache the output of these calls.
*/
$cache_id = 'project_members-' . $Project;
if ($cache = $GLOBALS['CEngine']->GetFromCache($cache_id, $Seconds=0, $IsObject=true)) {
return $cache;
}
$sql = "SELECT
pm.project
, pm.member
, fu.display_name
, pmr.description as role
, pm.can_write
, pm.can_read
, pm.joined_on
, pm.last_source_login
, pm.last_source_commit
FROM " . $GLOBALS['SqlTables']['ProjectMember'] . " pm
INNER JOIN " . $GLOBALS['SqlTables']['ForgeUser'] . " fu
ON pm.member = fu.user_id
INNER JOIN " . $GLOBALS['SqlTables']['ProjectMemberRole'] . " pmr
ON pm.role = pmr.project_member_role_id
WHERE pm.project = " . (int) $Project;
$results = $GLOBALS['Db']->GetRecords($sql);
$GLOBALS['CEngine']->WriteToCache($cache_id, $results);
return $results;
}
OK, so the first thing you will notice is that there's a comment saying basically, "look, this information really doesn't change all that much. Let's go ahead and cache the results of the database query for later re-use". We first ask the global CacheEngine object ($GLOBALS['CEngine']) if we have a cached version of the supplied Project's project members list:
if ($cache = $GLOBALS['CEngine']->GetFromCache($cache_id, $Seconds=0, $IsObject=true)) {
return $cache;
}
The GetFromCache() method of the CacheEngine class returns the requested data, or FALSE. So, in the above code, we simply return the cached data if it is available in the cache. The $Seconds argument to the GetFromCache() method is simply the number of seconds that the cached data should be considered valid. Passing a zero as this argument means we always consider the data valid. The $IsObject argument tells the CacheEngine to return the cached data as an array or an object. We'll see how this is implemented in a little bit. OK, so if the cached data does not exist in the cache, the ProjectMemberFinder::GetByProjectId() method continues on to request the data from the underlying database. The global Db abstraction layer object (described earlier) has its GetRecords() method called, with a SQL string passed as a parameter: $results = $GLOBALS['Db']->GetRecords($sql); It is the next line of code that facilitates the caching of this data in our content cache: $GLOBALS['CEngine']->WriteToCache($cache_id, $results); So, upon the first invocation of the GetByProjectId() method of ProjectMemberFinder, for each unique supplied Project ID value, we issue a request to the database and then cache the results for each subsequent call to the function. This saves us an enormous amount of database interaction, increasing the overall scalability of the system since the software can handle more concurrent requests, since the database connection will no longer be a bottleneck to the system. There are a couple cases that we need to handle when processing cache requests, including how to invalidate data in the cache. We'll get to these cases in a minute. First, let's take a look at the CacheEngine class' two main methods: WriteToCache() and GetFromCache(). The GetFromCache() MethodAs you saw above, the GetFromCache() method takes three arguments and returns either FALSE, or the cached data. Let's take a closer look at the CacheEngine::GetFromCache() method.
/**
* Retrieves a Cache file and returns either an object
* or a string
*
* @return mixed
* @param string Name of File in Cache
* @param int Number of Seconds until File is considered old
* @param bool Return an object from Cache?
* @access public
*/
function GetFromCache( $FileName , $Seconds = 0, $IsObject = false) {
$this->_BuildFileName($FileName);
$return = false;
if ($Seconds == 0) {
if (file_exists($this->_CacheFilePath)) {
$return = $this->_ReadFromCache();
}
else {
return false;
}
}
else {
$refresh_time = time() - (int) $Seconds;
if (filemtime($this->_CacheFilePath) > $refresh_time) {
$return = $this->_ReadFromCache();
}
else {
/** Cached data not valid, remove it */
$this->RemoveFromCache($FileName);
return false;
}
}
if ($IsObject) {
$return = unserialize($return);
}
return $return;
}
The GetFromCache() function should be fairly easy to follow. The meat of the function lies in either checking that the file exists (if there is no time limit on the cached entry) with the PHP file_exists() function or checking the modification time of the file using the filemtime() function otherwise. The $IsObject flag argument simply runs an unserialize() on the data coming back from the internal _ReadFromCache() method, which we will look at next:
/**
* Reads the local file from the cache directory
*
* @return mixed
* @access private
*/
function _ReadFromCache() {
$mq_setting = get_magic_quotes_runtime();
set_magic_quotes_runtime(0);
if (!$return_data = @ file_get_contents($file)) {
trigger_error(get_class() .
'::_ReadFromCache(): Unable to read file contents'
, E_USER_ERROR);
}
set_magic_quotes_runtime($mq_setting);
return $return_data;
}
This function handles reading the cached data from a cache file. The magic quotes runtime (perhaps the most annoying PHP feature ever) is turned off before reading the file to prevent automatic escaping of certain characters in the data, and then turned back to its original setting immediately after. So, the reading of a cache file is fairly simple. Let's take a look at the write mechanism of the CacheEngine class. This code is adapted from a technique which George Schlossnagle details in his excellent read, "Advanced PHP Programming" (Developer's Library, 2004) called file swapping. The technique facilitates lock-less writing of a cache file while allowing for simultaneous read requests of the cache file. Let's take a look:
/**
* Writes data to the cache
*
* @return mixed
* @param string File Name (may be encoded)
* @param mixed Data to write
* @access public
*/
function WriteToCache( $FileName, $Data ) {
if (is_array($Data) || is_object($Data)) {
$Data = serialize($Data);
}
$this->_BuildFileName($FileName);
/**
* Use a file swap technique to avoid need
* for file locks
*/
if (!$file = fopen($this->_CacheFilePath . getmypid(), "wb")) {
trigger_error(get_class($this) .
'::WriteToCache(): Could not open file for writing.'
, E_USER_ERROR);
return false;
}
$len_data = strlen($Data);
fwrite($file, $Data, $len_data);
fclose($file);
/** Handle file swap */
rename($this->_CacheFilePath . getmypid(), $this->_CacheFilePath);
return true;
}
The code above opens a temporary file for writing. Notice that the actual cache file name is appended with the getmypid() function output, essentially making the filename a temporary, unique filename. Then, given the fopen() call was able to open the file for writing, the data is then written to the file, and the file closed. Finally, the rename() function is used to change the temporary filename to the actual cache filename. Because the rename() function simply acts on the file inode (a structure which stores information about the file, not the file contents itself, the rename() operation is a) very quick, and b) allows other processes to read from any existing cache file with the existing name without blocking the rename operation. Cache invalidationOK, so our CacheEngine class now has most of the functionality needed to effectively cache data from the database. However, we still need a method of removing old cache data files. Hence, the very simple RemoveFromCache() method:
/**
* Removes a cache file from the cache directory
*
* @return mixed
* @param string File Name to remove (will be encoded)
* @access public
*/
function RemoveFromCache( $file_name ) {
$this->_BuildFileName($file_name);
if (!file_exists($this->_CacheFilePath)) {
return true;
}
else {
if (!unlink($this->_CacheFilePath)) {
trigger_error(get_class($this) .
'Unable to remove from cache file'
, E_USER_ERROR);
return false;
}
else {
clearstatcache();
return true;
}
}
}
All the above function does is remove the cached file if it exists. So, when would we call this function? Well, if you were paying attention earlier, you wold have already seen one occasion. In the GetFromCache() method, the RemoveFromCache() method is called when the modification time of the file exceeds the expiration time supplied to that function. Additionally, the RemoveFromCache() method would be called when we manually want to remove a cache file, for instance, if the list of project members changes, as this snippet from the /lib/class/ProjectMember.php class illustrates. The snippet comes from the ApproveMembershipRequest() method of that class:
...
/* Remove from review queue */
$sql = "UPDATE ... "
if ($GLOBALS['Db']->Execute($sql)) {
$GLOBALS['Db']->Commit();
$GLOBALS['Cache']->RemoveFromCache('project_members-' . $Project);
return true;
}
...
As you can see, when an approval is made, the database is updated and the cache for the project members is removed, so that upon the next request for this project's members, the cache file will be regenerated, ensuring valid, up-to-date data. SummarySo, this article has been an exploration into some simple steps that you can take in order to increase the scalability of your web applications by lazy loading and content caching techniques. Below are the full class files for the CacheEngine class and the SqlConnection class used in the MySQL Forge application. Feel free to use as you wish. Additionally, a patch to the Wordpress source code to enable lazy loading is included below. As always, comments and suggestions for this article are more than encouraged and appreciated. Trackbacks
Trackback specific URI for this entry
No Trackbacks
Comments
Display comments as
(Linear | Threaded)
PEAR::MDB2 also supports lazy loading for connections. This is an increasingly important feature as more and more AJAX websites reduce the amount of work per request, by letting the number of requests skyrocket. Caching as described above is the only way to really get a handle on things, but you still probably want to centralize certain core aspects of your application like creating the DBAL instance.
Interesting article.
From the WordPress standpoint, there's not much use for the average person to have 'lazy loading' support. 1. If you are running advanced-cache systems like WP-Cache or Staticize, the wpdb object should never get loaded (if I remember the code correctly). If it does, it's still needed, period. 2. If you are running the internal WordPress 'object cache' system, which only advanced admin-type users should do (in combination with a memcached or other caching replacement), THEN you obviously aren't an 'average user'. Anyway, query caching to disk is NOT a solution for the average user, on the average shared host -- at least not for WordPress. Generally, the disk usage is too high, you take a hit on the disk accesses versus MySQL having stuff in RAM. However, a generalized caching system, versus a particular caching system, is two different beasts. For instance, you are a plugin writer, who knows there's a particular query whose cost is incredibly high -- and YOU can decide to cache results to disk, and expire them, as needed to balance performance overall. But doing that on a singular case works, because you've determined the expense outweighs the potential disk hit on shared/bad environments. 3. However, here's the real gotcha. You're not just running WP, you're running a forum. Maybe an ad-server. So now, when you load your WP page, you're not creating ONE connection, you are creating three (or more!), one per 'system' that does its own access. I'd be willing to bet that's more of a problem for bigger shared sites... Anyway, food for thought. -d Waiting to establish the connection until you're about to execute a SQL statement is a reasonable performance optimization, but troublesome from a security perspective. Specifically, mysql_real_escape_string() (http://us3.php.net/manual/en/function.mysql-real-escape-string.php, http://dev.mysql.com/doc/refman/4.1/en/mysql-real-escape-string.html) requires a previously established connection to the database to operate correctly (on non-latin1 values assuming default configuration).
While addslashes() doesn't require this, it does not protect you against SQL injection attacks when dealing with non-latin1 values. One might imagine than many WordPress sites (as do many PHP sites in general) deal with UTF-8 data exclusively. Therefore, they need a solution which works for that as well. The obvious solution is to wrap the mysql_real_escape_string() and establish a connection to the database if one doesn't already exist in the Escape() function of your wrapper (and change the other functions to use it.) This would result in connections at query build time, not execute time, meeting the security requirement and preserve the performance advantage. But, in order to be secure, and to operate without warnings, you'd have to ban the use of mysql_real_escape_string() throughout your code base. While that's possible to do in a small development project, it's harder to do in the open source development model and in large code bases. (One thing that would be REALLY helpful is PHP language support to control the visibility of ext/ sourced functions.) Given the fact less that 100% compliance to using the wrapper results in a security hole, one can imagine why a WordPress developer might make the trade-off to always establish a connection at the top of their code - especially because they allow extensions via a plug-in API. /me raises hand again ...
Obviously you need to establish a connection once you need it, which is exactly what MDB2 does. So the idea is to establish a connection as late as possible. Do note however that this has the annoying drawback that if you have a connection issue you may already be deep into some module instead of your front controller or whatever detecting the failure early. But then your exception handling should be good enough. What happens if your db went down between you init'ing your db connection and you running a query....
I'm not convinced your point is one that you wouldn't be dealing with anyway. It's slightly different. We're not talking about a server going down once in a blue moon, and then completely down, but randomly-overloaded connections happening frequently over some period of time. So yes, I'd rather be able to present something earlier saying 'we're experiencing an overly high load', and show some kind of static homepage snapshot or something.
Hi Jay
Interesting article. Here are some tips: Use connection pooling. It can improve the performance a lot, but it can be tricky to implement. I suggest to use an already made PHP implementation. File caching can be quite expensive on high loads. Memory based caching is a lot faster. 4mir Hi!
The point is that file-based caching removes the need entirely to use a database connection at all... Anyway, I agree with you on the memory-based caching; yes, many times it can be faster, but this article was not about memory based caching. See Peter Z's recent article about other caching techniques for some more aspects: http://www.mysqlperformanceblog.com/2006/08/09/cache-performance-comparison/ -jay It would probably be a lot easier to use mysql_pconnect. PHP when run as an apache module can reuse connections. The connections persist between scripts, not just within the script. It’s fast and simple.
-Paul http://www.php.net/manual/en/features.persistent-connections.php This, IMHO, is a terrible idea. Persistent connections should not be used in PHP, as the connection will stick around in the apache child process, regardless of whether that process will be used or not. MySQL is built to handle very short, quick connections to the database, not persistent connections, and there are known scalability problems with using persistent connections.
See: http://www.databasejournal.com/features/mysql/article.php/3599166 Quote: "One of the ongoing debates floating around has been that around the merits of mysql_connect as opposed to mysql_pconnect. No longer! There is no function for calling persistent connections [in the new mysqli extension], as the consensus seems to have been that on high-volume sites, they did more harm than good, and there is no need for them on low-volume sites. The connection resource for MySQL has always been quite light." You can also declare the database object as static variable, so that it extends beyond the lifetime of a single call. That way, if you open up a connection in the header, you don't end up re-opening a connection in the footer:
http://www.bellaonline.com/articles/art29070.asp The database object is a global object, thus last the length of the page call.
Just to quickly respond to the last three:
- pooled connections don't usually working properly in a shared server environment (at least, that was my experience two years ago -- might have changed since then). - yes, file caching is generally bad in shared servers because of already high disk access time, and in any server environment is worse than a cache to ram technique (see my original comment...). - WP declares its wpdb object early on, and then it exists throughout the entire execution of the page. One could say that's wasteful, that shutting down the connection once it's no longer really needed would be better. Depends on actual page-compute time I guess. While I don't disagree with any of the things you've pointed out here in reference to wordpress. It's a simple question of need for optimization. Have Wordpress users or testing found this to be a performance bottleneck more than other aspects of the software? Not necessarily. In fact, only real performance testing can tell you where best to apply your optimizations. I fully agree with the priciple and on the surface, implementation of these optimizations and if you've come to a point where they are causing you performance slowdown then I'd suggest you go with them. However, I don't necessarily agree with the "this code isn't as efficient as it could be, therefore optimize it" as the sole motivator.
Great post! We definitely had a problem with that at work -- when every page includes 3-5 other pages, and each of those calls the database connection. And surprisingly, it wasn't a big issue, but we certainly fixed it when we noticed it.
I think the potential for a problem is a good enough reason to fix an application -- when you know your application will probably scale that high. In Wordpress' case, they have no idea who's using it for what, so they should definitely fix it, because someone might download it to use it in a highly scaled environment. In other words, if you say "well, 99% of the time, and for 99% of the people, this application is just fine" then what happens to the other 1%? You want to make sure that your software/application doesn't break then.... I fully agree with the priciple and on the surface, implementation of these optimizations and if you've come to a point where they are causing you performance slowdown then I'd suggest you go with them. However, I don't necessarily agree with the "this code isn't as efficient as it could be, therefore optimize it" as the sole motivator.
qfrxtgs hnsgcy bodyun qdjmekgc gjvaxcok fbsgyl yvuwgtl
topelxdc sfjtkg upkjsh wivpgnh svxnplrcy wpjmcqx fbhspwgja http://www.htnyidfeq.jlitp.com
tewbpk cweafxl igmoufw lwqazye jtdx wacebnfq zlreyg qrsneuphb ixyeq
pextrzu vyjixuwd drouvqt olkwncbhj ovwznlc fkgbms dginevrw [URL=http://www.msewvc.tmxcyj.com]pxsdjuae esodzrqc[/URL]
btxpi xesbqp roxtmfw kmzsxc tmiqgd vzlbua hdlv [URL]http://www.arpfy.hewjtxpvb.com[/URL] olzf atvim
ki3muzl63cz2 [URL=http://www.915471.com/657426.html] uf8xj61r [/URL] sqyl7p5j
mowp sxnzwl cneurjoga asmfqhyg romyzkxdw tpeazfox dcghusak
lnvcuyxb ymel iqafebm dajm tdpajzo rilqab xcthnaokp http://www.eldmk.ykbgsuq.com
egcxv hpbra urae tbhxq bnuymtc wubgvr adqnmh jnwbeoyt fwnktx
jnsxg zpivgqbt vthowgk rdcohvl raduswo pvjzxo kogpy [URL=http://www.nyopbsth.zhvk.com]smtunier vhrmwnj[/URL]
vexkw zuxkrlemt crmqgnl drvqnabzy aunqlo dgtqiy vsiqtx [URL]http://www.kjlqdp.owsbg.com[/URL] oqlstekh msqx
http://btjobf10.com btjobf11 btjobf12 [URL='http://btjobf13.com' ]btjobf13[/URL] btjobf14 [URL=http://btjobf15.com ]btjobf15[/URL]
Firstly, a very good article. Thank you. I recently experienced my first "Too Many Connections" issue and have converted to a lazy load scheme.
While this has reduced the erros I still get them. My system is a little different than a blogging system in that I sell items and need to know that items are still available before someone buys one. So caching is not an option. What has worked really well is: 1) aggregating calls. Try to limit the number of actual queries you need to make and aggregate them into as few selects as possible. 2) get data first. Do your DB work as the first thing a page does. That way you can wrap the connection around the very top of the page and release it as soon as possible. Shaving milliseconds off has actually paid dividends. 3) although I said caching is not an option... well you can cache up to the point when someone actually tries to buy the thing. So the majority of views will be cached, and only when people are buying will it hit the DB. You can even 'invalidate' the cached page if someone tries to buy and there's none left. What I'm trying to say is - have a variable cache policy. Like an annotation in Java, put a variable at the top of a page which indicates what policy should be applied to that individual page. That way you can have a more flexible approach to caching depending on the type of data and the amount of traffic. Anyway, I hope my insights can help. |
Calendar
QuicksearchArchivesCategoriesSyndicate This Blog |
||||||||||||||||||||||||||||||||||||||||||
