Обсуждение: GSoC Proposal - Caching query results in pgpool-II
Hello<br /><br /> My name is Masanori Yamazaki. I am sending my proposal about <br />Google Summer Of Code2011. It wouldbe nice if you could give <br />me your opinion.<br /><br /><br />・title<br /><br />Caching query results in pgpool-II<br/><br /><br />・Synopsis<br /><br />Pgpool-II has query caching functionality using storage provided by<br />dedicated PostgreSQL ("system database"). This has several drawbacks<br />however. 1)it is slow because it needs to accessdisk storage 2)it<br />does not invalidate the cache automatically.<br /><br />This proposal tries to solve these problems.<br/><br />- To speed up the cache access, it will be placed on memory, rather<br /> than database. The memorywill be put on shared memory or external<br /> memory services such as memcached so that the cache can be shared by<br/> multiple sessions. Old cache entries will be deleted by LRU manner.<br /><br />- The cache will be invalidatedautomatically upon the timing when the<br /> relevant tables are updated. Note that this is not always possible<br/> because the query result might come from multiple tables, views or<br /> even functions. In this case thecache will be invalidated by<br /> timeout(or they are not cached at all).<br /><br />- Fine tuning knobs need to beinvented to control the cache behavior<br /> though they are not clear at this moment.<br /><br /><br />・Benefits tothe PostgreSQL Community:<br /><br /><br />Query caching will effectively enhance the performance of PostgreSQL<br />andthis project will contribute to increase the number of users of<br /> PostgreSQL, who need more high performance databasesystems.<br /><br />Note that implementing query cache in pgpool-II will bring merits not<br />only to the latestversion of PostgreSQL but to the previous releases<br />of PostgreSQL.<br /><br /><br />・Project Schedule<br /><br/>-April<br /> preparation <br /><br />-May 1 - May 22 <br /> write a specification<br /><br />-May 23 - June 19<br/> coding<br /><br />-June 20 - July 22<br /> test<br /><br />-July 23 - August 12<br /> complete of coding and test,commit<br /><br /><br />・Personal Data and Biographical Information<br /><br /> Name : Masanori Yamazaki<br /> Born: 23.1.1981<br /> School :Currently I learn contemporary philosophy, culture and literature at Waseda University inJapan.<br /> Coding :<br /> 1.About five years job as web application programer(PHP, Java).<br /> 2.I experienced projectsused framework such as Symfony, Zend Framework, CakePHP, and Struts.<br /> 3.I am interested in OSS and like coding.<br/><br /><br /> Regards<br />
<p>How does this relate to the existing pqc project ( <a href="http://code.google.com/p/pqc/">http://code.google.com/p/pqc/</a>)?Seems the goals are fairly similar, and both arebased off pgpool?<p>/Magnus<br /><div class="gmail_quote">On Apr 6, 2011 2:10 AM, "Masanori Yamazaki" <<a href="mailto:m.yamazaki23@gmail.com">m.yamazaki23@gmail.com</a>>wrote:<br type="attribution" />> Hello<br />> <br/>> My name is Masanori Yamazaki. I am sending my proposal about<br /> > Google Summer Of Code2011. It would benice if you could give<br />> me your opinion.<br />> <br />> <br />> ・title<br />> <br />> Caching queryresults in pgpool-II<br />> <br />> <br />> ・Synopsis<br />> <br /> > Pgpool-II has query caching functionalityusing storage provided by<br />> dedicated PostgreSQL ("system database"). This has several drawbacks<br/>> however. 1)it is slow because it needs to access disk storage 2)it<br /> > does not invalidate thecache automatically.<br />> <br />> This proposal tries to solve these problems.<br />> <br />> - To speedup the cache access, it will be placed on memory, rather<br />> than database. The memory will be put on shared memoryor external<br /> > memory services such as memcached so that the cache can be shared by<br />> multiple sessions.Old cache entries will be deleted by LRU manner.<br />> <br />> - The cache will be invalidated automaticallyupon the timing when the<br /> > relevant tables are updated. Note that this is not always possible<br />>because the query result might come from multiple tables, views or<br />> even functions. In this case the cachewill be invalidated by<br /> > timeout(or they are not cached at all).<br />> <br />> - Fine tuning knobs needto be invented to control the cache behavior<br />> though they are not clear at this moment.<br />> <br />><br />> ・Benefits to the PostgreSQL Community:<br /> > <br />> <br />> Query caching will effectively enhancethe performance of PostgreSQL<br />> and this project will contribute to increase the number of users of<br />>PostgreSQL, who need more high performance database systems.<br /> > <br />> Note that implementing query cachein pgpool-II will bring merits not<br />> only to the latest version of PostgreSQL but to the previous releases<br/>> of PostgreSQL.<br />> <br />> <br />> ・Project Schedule<br /> > <br />> -April<br />>preparation<br />> <br />> -May 1 - May 22<br />> write a specification<br />> <br />> -May 23 - June19<br />> coding<br />> <br />> -June 20 - July 22<br />> test<br />> <br />> -July 23 - August 12<br/> > complete of coding and test, commit<br />> <br />> <br />> ・Personal Data and Biographical Information<br/>> <br />> Name : Masanori Yamazaki<br />> Born : 23.1.1981<br />> School :Currently I learn contemporaryphilosophy, culture and literature<br /> > at Waseda University in Japan.<br />> Coding :<br />> 1.Aboutfive years job as web application programer(PHP, Java).<br />> 2.I experienced projects used framework such asSymfony, Zend Framework,<br />> CakePHP, and Struts.<br /> > 3.I am interested in OSS and like coding.<br />><br />> <br />> Regards<br /></div>
In my understanding pqc is not designed to be working with pgpool. Thus if a user want to use both query cache and query dispatching, replication or failover etc. which are provided by pgpool, it seems it's not possible. For this purpose maybe user could *cascade* pqc and pgpool, but I'm not sure. Even if it's possible, it will bring huge performance penalty. Another point is cache invalidation. Masanori's proposal includes cache invalidation technique by looking at write queries, which is lacking in pqc in my understanding. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp > How does this relate to the existing pqc project ( > http://code.google.com/p/pqc/)? Seems the goals are fairly similar, and both > are based off pgpool? > > /Magnus > On Apr 6, 2011 2:10 AM, "Masanori Yamazaki" <m.yamazaki23@gmail.com> wrote: >> Hello >> >> My name is Masanori Yamazaki. I am sending my proposal about >> Google Summer Of Code2011. It would be nice if you could give >> me your opinion. >> >> >> ・title >> >> Caching query results in pgpool-II >> >> >> ・Synopsis >> >> Pgpool-II has query caching functionality using storage provided by >> dedicated PostgreSQL ("system database"). This has several drawbacks >> however. 1)it is slow because it needs to access disk storage 2)it >> does not invalidate the cache automatically. >> >> This proposal tries to solve these problems. >> >> - To speed up the cache access, it will be placed on memory, rather >> than database. The memory will be put on shared memory or external >> memory services such as memcached so that the cache can be shared by >> multiple sessions. Old cache entries will be deleted by LRU manner. >> >> - The cache will be invalidated automatically upon the timing when the >> relevant tables are updated. Note that this is not always possible >> because the query result might come from multiple tables, views or >> even functions. In this case the cache will be invalidated by >> timeout(or they are not cached at all). >> >> - Fine tuning knobs need to be invented to control the cache behavior >> though they are not clear at this moment. >> >> >> ・Benefits to the PostgreSQL Community: >> >> >> Query caching will effectively enhance the performance of PostgreSQL >> and this project will contribute to increase the number of users of >> PostgreSQL, who need more high performance database systems. >> >> Note that implementing query cache in pgpool-II will bring merits not >> only to the latest version of PostgreSQL but to the previous releases >> of PostgreSQL. >> >> >> ・Project Schedule >> >> -April >> preparation >> >> -May 1 - May 22 >> write a specification >> >> -May 23 - June 19 >> coding >> >> -June 20 - July 22 >> test >> >> -July 23 - August 12 >> complete of coding and test, commit >> >> >> ・Personal Data and Biographical Information >> >> Name : Masanori Yamazaki >> Born : 23.1.1981 >> School :Currently I learn contemporary philosophy, culture and literature >> at Waseda University in Japan. >> Coding : >> 1.About five years job as web application programer(PHP, Java). >> 2.I experienced projects used framework such as Symfony, Zend Framework, >> CakePHP, and Struts. >> 3.I am interested in OSS and like coding. >> >> >> Regards
2011/4/7 Tatsuo Ishii <ishii@postgresql.org>: > In my understanding pqc is not designed to be working with pgpool. > Thus if a user want to use both query cache and query dispatching, > replication or failover etc. which are provided by pgpool, it seems > it's not possible. For this purpose maybe user could *cascade* pqc and > pgpool, but I'm not sure. Even if it's possible, it will bring huge > performance penalty. > > Another point is cache invalidation. Masanori's proposal includes > cache invalidation technique by looking at write queries, which is > lacking in pqc in my understanding. Probably. My question wasn't necessarily "hasn't this already been done in pqc", more "should this perhaps build on or integrate with pgc in order not to duplicate effort". I think at the very least, any overlap should be researched and identified - because if it can integrate parts of pgc, or work with, more effort can be spent on the new parts rather than redoing something that's already been done. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/