Developing an online store with 13,000+ products on MODX Revolution. Part 1
I have written before about my component shopModx. Although few appreciate it, since most people are waiting for a ready-made solution with one big "Install and go" button, this component is being developed with the shortcomings of MODX in mind, the ones MODX developers constantly run into, as well as the advantages MODX does have but which developers either don't know about or simply don't use.
Article based on information from habrahabr.ru
Let me say right away that this module is not being developed in the abstract. It is being built around two small stores (for a start), and the end result should be a solid platform for implementing large online stores.
Today I would like to start a series of articles about developing large online stores on MODX Revolution, describing the problems encountered and the solutions applied to them. I will also cover what shopModx will have to carry on board to solve such problems, and which techniques give you 100% control over developing your own unique store without digging into shopModx's code.
So, a little about the store the work is being done on: it is an online furniture store. Yesterday I imported the catalog into it. The result: 13,000+ documents, 43,000+ TV values, and nearly 13,000 records in modx_shopmodx_products.
I should say that I expect page generation, even uncached and with parameter-based search, to take under 1 second, and the average load time should not exceed 0.3-0.4 s.
So, briefly about the first problems and their solutions.
Problem 1. A large file cache and high memory usage
First, baseline figures for a clean Revo. I specifically downloaded a clean 2.3.0 and watched its memory usage. The measuring code sits in a plugin on the OnWebPageComplete event, the very last execution point in MODX, after exit(), after the document cache has been saved, and so on. First run (all cache files deleted manually):
Memory: 13.5409 Mb
TotalTime: 0.1880 s
Then:
Memory: 10.1396 Mb
TotalTime: 0.0640 s
By the way, here is the plugin code, just in case: gist.github.com/Fi1osof/5062419
It can be modified to check access permissions, so that you can always see the current server load.
Now let's check the results on the store (to clarify: the document is not empty, it has 8 attached TV parameters, one of which is an image with a custom media source). First run:
Memory: 24.1438 Mb
TotalTime: 0.4360 s
Then:
Memory: 18.4103 Mb
TotalTime: 0.0960 s
So memory usage grows by almost 10 MB right away. That is because the entire URL map of the context gets cached, and we have 13,000+ documents; the context cache file alone is almost 2 MB.
The obvious solution is to shrink the context cache file. I have already written in more detail about the finer points of MODX caching and about my cacheOptimizer patch. Install it and disable caching of the resource map for the web context. New results:
First run:
Memory: 16.1369 Mb
TotalTime: 0.2640 s
Then:
Memory: 10.4021 Mb
TotalTime: 0.0720 s
That is, in normal operation it consumes almost the same amount of memory as a bare system.
Problem 2. Page not found (404)
This problem follows directly from the previous solution :-) Since we disabled caching of the URL map, MODX now has to figure out which page is being requested when friendly URLs are in use. To clarify: if you don't use friendly URLs, this won't be a problem for you (though who doesn't use them these days?), and if your store isn't large (say, 1,000 products), you don't have to cut the page map out of the cache at all, since a few extra megabytes of RAM are no big deal.
To solve this problem I decided to use my own router. I wrote a new class that extends modRequest and slightly tweaked a couple of methods. The logic is as follows: when a page is requested, MODX tries to find the resource id for the requested URL in the cache (the URL has already been cleaned, i.e. stripped of query parameters and so on). If it finds one, it returns the id and everything proceeds as usual. If not, it tries to find the document in the database by its uri. If found, it writes the id to the cache and returns it. If not, the standard OnPageNotFound procedure runs (so you can still use your own plugin to modify the search).
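That lookup order can be sketched roughly like this. This is an illustration under assumptions, not the actual shopModx code: the method name and the cache-key scheme are invented here for the example.

```php
// Illustrative sketch of a modRequest subclass implementing the
// cache -> database -> OnPageNotFound lookup described above.
class ShopmodxRequest extends modRequest {
    // Hypothetical helper, called from a tweaked getResource()
    public function findResourceId($alias) {
        $cacheKey = 'uri_map/' . md5($alias);
        // 1. Try the cache first
        $id = $this->modx->cacheManager->get($cacheKey);
        if ($id) {
            return $id;
        }
        // 2. Fall back to a direct database lookup by uri
        $resource = $this->modx->getObject('modResource', array(
            'uri' => $alias,
            'context_key' => $this->modx->context->get('key'),
        ));
        if ($resource) {
            $id = $resource->get('id');
            $this->modx->cacheManager->set($cacheKey, $id);
            return $id;
        }
        // 3. Nothing found: let the standard OnPageNotFound flow run
        return null;
    }
}
```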
This additional class will ship with shopModx, and if anyone needs it (that is, if the store is large), it can simply be enabled in the settings (the modRequest.class key).
There is also the option of writing all pages into the cache at once, for example when the cache is refreshed (via a plugin on the OnSiteRefresh event).
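Such a pre-warming plugin might look roughly like this. This is a sketch under assumptions: the cache-key scheme here is invented and would have to match whatever the router class actually reads.

```php
// Hypothetical OnSiteRefresh plugin: after a cache refresh, write a
// uri => id entry for every non-deleted resource into the cache.
switch ($modx->event->name) {
    case 'OnSiteRefresh':
        foreach ($modx->getIterator('modResource', array('deleted' => 0)) as $resource) {
            $id = $resource->get('id');
            $modx->cacheManager->set('uri_map/' . md5($resource->get('uri')), $id);
        }
        break;
}
```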
Problem 3. Too many cache files
I can imagine how many readers got to the previous solution and thought, "what a moron!" :-)
Yes, producing hundreds of thousands of cache files would be complete insanity. But the key word here is files. It is precisely their file nature that gives us no peace. So in this case we simply use a different cache provider instead of the file-based one. I decided to use memcached, since I already had some experience with it and it was installed on the server; you can use whatever you prefer. The standard Revo build also ships with memcache and APC handlers.
I justified the choice of a RAM-based cache mechanism by how much it simplifies flushing the cache. Try deleting 1,000,000 files from a hard disk: it takes a very long time. With memcached, a cache reset is done simply and quickly:
$modx->cacheManager->getCacheProvider()->flush();
Another huge plus: memcached can store almost any data type, including objects. The exception is resources (for example, a database connection) and objects that have resources among their properties. Such objects should implement the __sleep() and __wakeup() methods, so that all resource properties are removed before storing, and re-created when the object is restored from the cache.
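A minimal, framework-free sketch of that __sleep()/__wakeup() pattern (the class and the PDO connection details here are purely illustrative):

```php
// Sketch: an object that drops its resource property before being
// serialized into the cache and re-creates it on wakeup.
class CachedRepository {
    private $pdo;   // a live connection: cannot be serialized
    private $dsn;   // plain data: safe to serialize

    public function __construct($dsn) {
        $this->dsn = $dsn;
        $this->connect();
    }

    private function connect() {
        $this->pdo = new PDO($this->dsn);
    }

    public function __sleep() {
        // Return only serializable properties; $pdo is dropped
        return array('dsn');
    }

    public function __wakeup() {
        // Re-create the resource when restored from the cache
        $this->connect();
    }
}
```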
So, let's look at the results. First run:
Memory: 15.0709 Mb
TotalTime: 0.1040 s
Then:
Memory: 10.403 Mb
TotalTime: 0.0640 s
In my opinion, that is very good for an uncached context with 13,000+ documents.
Problem 4. Bulk document updates when system settings change
I won't explain why, but I needed to change the container suffix. I changed it, and never saw the Ajax response... so I went to look at the processor /system/settings/updatefromgrid. It has a method checkForRefreshURIs(). If you change "friendly_urls", "use_alias_path", or "container_suffix", it signals that the URIs need to be refreshed. Fair enough. The problem is that it tries to update all documents indiscriminately, not just containers. On top of that, for some reason it sorts by menuindex (whereas what matters here is nesting order, not the menu index).
In short, this process made the server cry. After I added an isfolder = 1 condition, all the containers were updated in about 6 seconds. I won't be changing suffixes again :-)
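For illustration, a container-only refresh could be sketched like this. This is a hypothetical rework, not the core code: the actual logic lives in checkForRefreshURIs() inside the updatefromgrid processor, and the uri-rebuild-on-save behavior is an assumption here.

```php
// Sketch: refresh URIs for containers only, ordered by nesting
// rather than by menuindex. Assumes a live $modx instance.
$c = $modx->newQuery('modResource');
$c->where(array('isfolder' => 1));
$c->sortby('parent', 'ASC'); // parents before children
foreach ($modx->getIterator('modResource', $c) as $resource) {
    // Emptying the uri is assumed to force a rebuild on save()
    $resource->set('uri', '');
    $resource->set('uri_override', 0);
    $resource->save();
}
```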
Summary
In practice, we got full page processing on a site with 13,000+ documents (across two tables) and 43,000+ TV values in under 0.3 seconds while the cache is being refreshed, and under 0.1 s from the cache.
We can tentatively say that at this point the difference between a large and a small site ends: any further slowdown can only happen at the page-rendering level, and that depends on how we write our templates and so on.
I will write about that in the next article (most likely tomorrow). I will say right away that I'm going to do it all in Smarty, because IMHO doing all of this with plain chunks and snippets means a lot of problems.
And finally, the results of a local test with 100 concurrent clients, 1,000 requests each: gist.github.com/Fi1osof/462e1af10ab7b95311df
Time per request: 44.224 [ms] (mean, across all concurrent requests)
P.S. Package on modx.com: modx.com/extras/package/shopmodx
Project on Github: github.com/Fi1osof/shopModx
The latest version of the request class has been uploaded there.
P.P.S. It is better to specify the memcached provider right in config.core.php (just trust me):
$config_options = array(
    'cache_handler' => 'cache.xPDOMemCached',
    'cache_prefix' => 'shopmodx_', // use a different prefix for each site sharing the same memcached server
);
The $config_options variable is already there.