Discussion: A query planner that learns
While all the talk of a hinting system over in hackers and perform is good, and I have a few queries that could live with a simple hint system pop up now and again, I keep thinking that a query planner that learns from its mistakes over time is far more desirable.

Is it reasonable or possible for the system to have a way to look at query plans it's run and look for obvious mistakes it's made, like being off by a factor of 10 or more in estimations, and slowly learn to apply its own hints?

Seems to me that would be far more useful than my having to babysit the queries that are running slow and come up with hints to have the database do what I want. I already log slow queries and review them once a week by running them with explain analyze and adjust what little I can, like stats targets and such.

It seems to me the first logical step would be having the ability to flip a switch and when the postmaster hits a slow query, it saves both the query that ran long, as well as the output of explain or explain analyze or some bastardized version missing some of the inner timing info. Even just saving the parts of the plan where the planner thought it would get 1 row and got instead 350,000 and was using a nested loop to join would be VERY useful. I could see something like that eventually evolving into a self tuning system.

Well, I'm busy learning to be an Oracle DBA right now, so I can't do it. But it would be a very cool project for the next college student who shows up looking for one.
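The "off by a factor of 10 or more" check can be prototyped outside the server today, against the text that EXPLAIN ANALYZE already prints. The sketch below is illustrative only (no such tool ships with PostgreSQL); it assumes the familiar plan-line format, where each node shows an estimated `rows=` in the cost parentheses and an actual `rows=` in the `(actual ...)` parentheses.

```python
import re

# One EXPLAIN ANALYZE node line looks like:
#   Nested Loop  (cost=0.00..4.02 rows=1 width=8) (actual time=0.05..123.4 rows=350000 loops=1)
NODE_RE = re.compile(r"rows=(\d+)\s+width=\d+\)\s+\(actual[^)]*\brows=(\d+)")

def flag_misestimates(plan_text, factor=10):
    """Return (line, estimated_rows, actual_rows) for every plan node whose
    row estimate is off by at least `factor` in either direction."""
    flagged = []
    for line in plan_text.splitlines():
        m = NODE_RE.search(line)
        if not m:
            continue
        est, act = int(m.group(1)), int(m.group(2))
        ratio = max(est, 1) / max(act, 1)   # clamp to avoid division by zero
        if ratio >= factor or 1 / ratio >= factor:
            flagged.append((line.strip(), est, act))
    return flagged
```

Fed the nested-loop example above (estimated 1 row, got 350,000), this flags exactly that node and ignores nodes whose estimates were close.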
On Thu, Oct 12, 2006 at 03:31:50PM -0500, Scott Marlowe wrote:
> While all the talk of a hinting system over in hackers and perform is good, and I have a few queries that could live with a simple hint system pop up now and again, I keep thinking that a query planner that learns from its mistakes over time is far more desirable.
>
> Is it reasonable or possible for the system to have a way to look at query plans it's run and look for obvious mistakes it's made, like being off by a factor of 10 or more in estimations, and slowly learn to apply its own hints?
>
> [snip]
>
> It seems to me the first logical step would be having the ability to flip a switch and when the postmaster hits a slow query, it saves both the query that ran long, as well as the output of explain or explain analyze or some bastardized version missing some of the inner timing info. Even just saving the parts of the plan where the planner thought it would get 1 row and got instead 350,000 and was using a nested loop to join would be VERY useful. I could see something like that eventually evolving into a self tuning system.

Saves it and then... does what? That's the whole key...

> Well, I'm busy learning to be an Oracle DBA right now, so I can't do it. But it would be a very cool project for the next college student who shows up looking for one.

Why? There's a huge demand for PostgreSQL experts out there... or is this for a current job?
--
Jim Nasby    jim@nasby.net    EnterpriseDB    http://enterprisedb.com    512.569.9461 (cell)
On Thu, 2006-10-12 at 17:14, Jim C. Nasby wrote:
> On Thu, Oct 12, 2006 at 03:31:50PM -0500, Scott Marlowe wrote:
> > [snip]
> > It seems to me the first logical step would be having the ability to flip a switch and when the postmaster hits a slow query, it saves both the query that ran long, as well as the output of explain or explain analyze or some bastardized version missing some of the inner timing info. Even just saving the parts of the plan where the planner thought it would get 1 row and got instead 350,000 and was using a nested loop to join would be VERY useful. I could see something like that eventually evolving into a self tuning system.
>
> Saves it and then... does what? That's the whole key...

It's meant as a first step. I could certainly use a daily report on which queries had bad plans so I'd know which ones to investigate without having to run them each myself in explain analyze. Again, my point was to do it incrementally. This is something someone could do now, and someone could build on later.

To start with, it does nothing. Just saves it for the DBA to look at. Later, it could feed any number of the different hinting systems people have been proposing.

It may well be that by first looking at the data collected from problem queries, the solution for how to adjust the planner becomes more obvious.

> > Well, I'm busy learning to be an Oracle DBA right now, so I can't do it. But it would be a very cool project for the next college student who shows up looking for one.
>
> Why? There's a huge demand for PostgreSQL experts out there... or is this for a current job?

Long story. I do get to do lots of pgsql stuff. But right now I'm learning Oracle as well, cause we use both DBs. It's just that since I know pgsql pretty well, and know oracle hardly at all, Oracle is taking up lots more of my time.
Forgive me if I'm way off here as I'm not all that familiar with the internals of postgres, but isn't this what the genetic query optimizer discussed in one of the manual's appendixes is supposed to do? Or, is that more about using the known costs of atomic operations that make up a query plan, i.e. more of a bottom up approach than what you're discussing? If it's the latter, then it sounds like what you're looking for is a classifier system. See <a href="http://en.wikipedia.org/wiki/Learning_classifier_system">this</a> Wikipedia article for a short description of them along with a couple of reference links; you can google for many more.

Scott Marlowe wrote:
> On Thu, 2006-10-12 at 17:14, Jim C. Nasby wrote:
>> Saves it and then... does what? That's the whole key...
>
> [snip]
>
> It's meant as a first step. I could certainly use a daily report on which queries had bad plans so I'd know which ones to investigate without having to run them each myself in explain analyze. Again, my point was to do it incrementally. This is something someone could do now, and someone could build on later.
>
> To start with, it does nothing. Just saves it for the DBA to look at. Later, it could feed any number of the different hinting systems people have been proposing.
>
> It may well be that by first looking at the data collected from problem queries, the solution for how to adjust the planner becomes more obvious.

--
erik jones <erik@myemma.com>
software development
emma(r)
Erik Jones wrote:
> Forgive me if I'm way off here as I'm not all that familiar with the internals of postgres, but isn't this what the genetic query optimizer discussed in one of the manual's appendixes is supposed to do?

No - it's not an "optimizer" in that sense. When there is a small enough set of tables involved, the planner uses a dynamic programming algorithm to explore the entire space of all possible plans. But the space grows exponentially (I think) with the number of tables - when this would take too long, the planner switches to a genetic algorithm approach, which explores a small fraction of the plan space, in a guided manner.

But with both approaches, the planner is just using the static statistics gathered by ANALYZE to estimate the cost of each candidate plan, and these statistics are based on sampling your data - they may be wrong, or at least misleading. (In particular, the statistic for total number of unique values is frequently =way= off, per a recent thread here. I have been reading about this, idly thinking about how to improve the estimate.)

The idea of a learning planner, I suppose, would be one that examines cases where these statistics lead to very misguided expectations. The simplest version of a "learning" planner could simply bump up the statistics targets on certain columns. A slightly more sophisticated idea would be for some of the statistics to optionally use parametric modeling (this column is a Gaussian, let's estimate the mean and variance; this one is a Beta distribution ...). Then the smarter planner could spend some cycles applying more sophisticated statistical modeling to problematic tables/columns.

- John D. Burger
  MITRE
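John's parametric-modeling idea can be made concrete with the Gaussian case: fit a mean and standard deviation from a sample of the column, then estimate the selectivity of a range predicate from the normal CDF instead of a histogram. A hypothetical sketch (PostgreSQL's actual estimator uses histograms and most-common-value lists, not this):

```python
from math import erf, sqrt
from statistics import mean, stdev

def gaussian_selectivity(sample, threshold):
    """Estimated fraction of rows with value < threshold, assuming the
    column is roughly Gaussian; mu and sigma are fit from the sample."""
    mu, sigma = mean(sample), stdev(sample)
    # Normal CDF expressed via the error function
    return 0.5 * (1 + erf((threshold - mu) / (sigma * sqrt(2))))
```

For a genuinely Gaussian column this needs only two numbers per column instead of a whole histogram; for a skewed or multimodal column it would be badly wrong, which is why the choice of model would itself have to be part of the "learning".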
On Oct 13, 2006, at 11:47, John D. Burger wrote:
> [snip]
> The idea of a learning planner, I suppose, would be one that examines cases where these statistics lead to very misguided expectations. The simplest version of a "learning" planner could simply bump up the statistics targets on certain columns.

One simple first step would be to run an ANALYZE whenever a sequential scan is executed. Is there a reason not to do this? It could be controlled by a GUC variable in case someone wants repeatable plans.

Further down the line, statistics could be collected during the execution of any query - updating histograms on delete and update, as well.

-M
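The "updating histograms on delete and update" idea amounts to keeping bucket counts current on every write instead of rebuilding them with a periodic ANALYZE. A toy sketch with fixed equi-width buckets (the planner's real histograms are equi-depth, and the bucket boundaries themselves would eventually need re-fitting as the data drifts):

```python
from bisect import bisect_right

class IncrementalHistogram:
    """Equi-width histogram whose counts are maintained per insert/delete."""
    def __init__(self, lo, hi, nbuckets=10):
        # Interior bucket boundaries between lo and hi
        self.edges = [lo + (hi - lo) * i / nbuckets for i in range(1, nbuckets)]
        self.counts = [0] * nbuckets
        self.total = 0

    def _bucket(self, value):
        return bisect_right(self.edges, value)

    def insert(self, value):
        self.counts[self._bucket(value)] += 1
        self.total += 1

    def delete(self, value):
        self.counts[self._bucket(value)] -= 1
        self.total -= 1

    def selectivity_below(self, value):
        """Rough fraction of current rows < value, from bucket counts alone."""
        return sum(self.counts[:self._bucket(value)]) / self.total if self.total else 0.0
```

The per-write cost here is tiny, but on a real server the contention on shared statistics counters is exactly the kind of overhead Jim raises in the next message.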
On Thu, Oct 12, 2006 at 05:39:20PM -0500, Scott Marlowe wrote:
> [snip]
> It's meant as a first step. I could certainly use a daily report on which queries had bad plans so I'd know which ones to investigate without having to run them each myself in explain analyze. Again, my point was to do it incrementally. This is something someone could do now, and someone could build on later.
>
> To start with, it does nothing. Just saves it for the DBA to look at. Later, it could feed any number of the different hinting systems people have been proposing.
>
> It may well be that by first looking at the data collected from problem queries, the solution for how to adjust the planner becomes more obvious.

Yeah, that would be useful to have. The problem I see is storing that info in a format that's actually useful... and I'm thinking that a logfile doesn't qualify since you can't really query it.
--
Jim Nasby    jim@nasby.net    EnterpriseDB    http://enterprisedb.com    512.569.9461 (cell)
On Fri, Oct 13, 2006 at 11:53:15AM -0400, AgentM wrote:
> One simple first step would be to run an ANALYZE whenever a sequential scan is executed. Is there a reason not to do this?

Yes. You want a seqscan on a small (couple of pages) table, and ANALYZE has a very high overhead on some platforms.

Just recording the query plan and actual vs estimated rowcounts would be a good start, though. And useful to DBAs, provided you had some means to query against it.
--
Jim Nasby    jim@nasby.net    EnterpriseDB    http://enterprisedb.com    512.569.9461 (cell)
On Fri, 2006-10-13 at 12:48, Jim C. Nasby wrote:
> [snip]
> Yeah, that would be useful to have. The problem I see is storing that info in a format that's actually useful... and I'm thinking that a logfile doesn't qualify since you can't really query it.

grep / sed / awk can do amazing things to a text file. I'd actually recommend URL encoding (or something like that) so they'd be single lines, then you could grep for certain things and feed the lines to a simple de-encoder.

We do it with our log files at work and can search through some fairly large files for the exact entry we need fairly quickly.
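The URL-encoding trick is easy to demonstrate: percent-encode the query and its multi-line plan so each log entry becomes one grep-able line, and decode on the way back out. A sketch (the `query=`/`plan=` field names are made up for this example):

```python
from urllib.parse import quote, unquote

def encode_entry(query, plan):
    """Collapse a slow query and its multi-line plan into one log line."""
    return "query=%s plan=%s" % (quote(query, safe=""), quote(plan, safe=""))

def decode_entry(line):
    """Reverse encode_entry: split on the literal separator and unquote."""
    q, p = line.split(" plan=")
    return unquote(q[len("query="):]), unquote(p)
```

Because `quote(..., safe="")` encodes spaces and `=` inside the fields, the literal `" plan="` separator can only occur once per line, so grep hits can be fed straight to the decoder.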
Jim C. Nasby wrote:
> Just recording the query plan and actual vs estimated rowcounts would be a good start, though. And useful to DBAs, provided you had some means to query against it.

If the DBMS stores the incoming query (or some parsed version of it) and identifies frequency of usage over time, it can then spend spare cycles more deeply analyzing frequently used queries. Many DB servers have usage patterns just like end-user workstations (not all, I realize): they are busy for predictable periods of the day and fairly idle at other times.

To provide acceptable response times, DBMSs don't have the luxury of analyzing numerous query paths when queries are received. But they could do this offline so that the next time the server sees the same query, it can use a better-optimized plan. Storing the query itself is probably a better idea than storing the plan, since the plan may change over time.

--
Guy Rouillier
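Guy's "identify frequency of usage over time" step is essentially query fingerprinting: strip the literals so structurally identical statements collapse to one key, then count. An illustrative sketch (a real implementation would want a proper parser rather than regexes):

```python
import re
from collections import Counter

def fingerprint(query):
    """Normalize a statement: replace string/numeric literals with a
    placeholder, collapse whitespace, and lowercase."""
    q = re.sub(r"'[^']*'", "?", query)   # string literals
    q = re.sub(r"\b\d+\b", "?", q)       # numeric literals
    return re.sub(r"\s+", " ", q).strip().lower()

def hot_queries(logged_queries, top=5):
    """Most frequent fingerprints: candidates for offline deep analysis."""
    return Counter(fingerprint(q) for q in logged_queries).most_common(top)
```

A background job could then re-plan the top fingerprints during the idle hours Guy describes and cache any better plan against the fingerprint.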
On 10/13/06 10:47, John D. Burger wrote:
> [snip]
> But with both approaches, the planner is just using the static statistics gathered by ANALYZE to estimate the cost of each candidate plan, and these statistics are based on sampling your data - they may be wrong, or at least misleading. (In particular, the statistic for total number of unique values is frequently =way= off, per a recent thread here. I have been reading about this, idly thinking about how to improve the estimate.)

What about an ANALYZE FULL <table>, reading every record in the table, and every node in every index, storing in pg_statistic (or some new, similar table) such items as the AVG and STD of the number of records in each page, and b-tree depth, keys per node, records per key and per segment? Maybe even "average distance between pages in the tablespace".

This would let the optimizer know things like "the value of the field which is the first segment of an index (and which is the only part of the index in the WHERE clause) describes 75% of the rows in the table, and the records are all packed in tightly in the pages, and the pages are close together", so the optimizer could decide "a table scan would be much more efficient".

In some ways, this would be similar in functionality to the existing histogram created by ANALYZE, but would provide a slightly different picture.

--
Ron Johnson, Jr.
Jefferson LA USA
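The decision Ron wants the optimizer to reach ("a table scan would be much more efficient") falls out of a cost comparison once selectivity and physical clustering are known. A toy model, loosely patterned on the planner's seq_page_cost/random_page_cost split but not its actual formula:

```python
def choose_scan(total_pages, total_rows, selectivity, correlation,
                seq_page_cost=1.0, random_page_cost=4.0, cpu_tuple_cost=0.01):
    """Pick seqscan vs indexscan from a crude cost model.
    correlation in [0, 1]: 1 = rows perfectly clustered on the key,
    0 = rows scattered, so index fetches are all random I/O."""
    seq_cost = total_pages * seq_page_cost + total_rows * cpu_tuple_cost
    matching = total_rows * selectivity
    # Page fetches get cheaper (more sequential) as clustering improves
    page_cost = random_page_cost + (seq_page_cost - random_page_cost) * correlation
    index_cost = matching * (page_cost + cpu_tuple_cost)
    return ("seqscan", seq_cost) if seq_cost <= index_cost else ("indexscan", index_cost)
```

With 75% selectivity and tightly packed, well-clustered pages this picks the seqscan, which is exactly Ron's example; with 0.1% selectivity over scattered rows it picks the index.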
Jim C. Nasby wrote:
> On Thu, Oct 12, 2006 at 05:39:20PM -0500, Scott Marlowe wrote:
>> It may well be that by first looking at the data collected from problem queries, the solution for how to adjust the planner becomes more obvious.
>
> Yeah, that would be useful to have. The problem I see is storing that info in a format that's actually useful... and I'm thinking that a logfile doesn't qualify since you can't really query it.

You'd need something that contains the query plan (obviously), but also all relevant information about the underlying data model and data distribution. Some meta-data, like the version of PostgreSQL, is probably required as well.

The current statistics contain some of this information, but from reading this list I know that that's rarely enough information to determine an error made by the planner.

Regards,
--
Alban Hertroys
alban@magproductions.nl

magproductions b.v.
T: ++31(0)534346874
F: ++31(0)534346876
I: www.magproductions.nl
A: Postbus 416
   7500 AK Enschede

// Integrate Your World //
Scott Marlowe wrote:
> While all the talk of a hinting system over in hackers and perform is good, and I have a few queries that could live with a simple hint system pop up now and again, I keep thinking that a query planner that learns from its mistakes over time is far more desirable.
>
> Is it reasonable or possible for the system to have a way to look at query plans it's run and look for obvious mistakes it's made, like being off by a factor of 10 or more in estimations, and slowly learn to apply its own hints?

Technically it is very feasible. But I think you might want to check US Patent 6,763,359 before you start writing any code.

> It seems to me the first logical step would be having the ability to flip a switch and when the postmaster hits a slow query, it saves both the query that ran long, as well as the output of explain or explain analyze or some bastardized version missing some of the inner timing info. Even just saving the parts of the plan where the planner thought it would get 1 row and got instead 350,000 and was using a nested loop to join would be VERY useful. I could see something like that eventually evolving into a self tuning system.

I think it would be a good start if we can specify a log_selectivity_error_threshold and if estimates are more than that factor off, the query, parameters and planner estimates get logged for later analysis. That would be driven entirely by selectivity estimates and not (estimated) cost, since cost is influenced by outside factors such as other processes competing for resources.

If a system for statistical hints emerges from the current discussion we would indeed have the input to start tuning the selectivity estimations.

Jochem
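Jochem's proposed setting reduces to a single comparison at query end. A sketch of the threshold test (log_selectivity_error_threshold is his hypothetical GUC, not an existing parameter):

```python
def should_log(estimated_rows, actual_rows, log_selectivity_error_threshold=10.0):
    """True when the row estimate is off by at least the threshold factor
    in either direction; zero counts are clamped to one row."""
    err = max(estimated_rows, 1) / max(actual_rows, 1)
    return max(err, 1 / err) >= log_selectivity_error_threshold
```

Scott's nested-loop case (estimated 1 row, actual 350,000) trips the default threshold; a 100-vs-98 estimate does not.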
Jochem van Dieten wrote:
> Scott Marlowe wrote:
>> [snip]
>
> Technically it is very feasible. But I think you might want to check US Patent 6,763,359 before you start writing any code.

I think it would be a very good idea if you guys stopped looking at the US patent database. It does no good to anyone. There's no way we can avoid stomping on one patent or another -- there are patents for everything.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Jochem van Dieten wrote: > I think you might want to check US Patent 6,763,359 before you > start writing any code. http://tinyurl.com/yzjdve - John D. Burger MITRE
Alvaro Herrera wrote:
> Jochem van Dieten wrote:
>> Technically it is very feasible. But I think you might want to check US Patent 6,763,359 before you start writing any code.
>
> I think it would be a very good idea if you guys stopped looking at the US patent database. It does no good to anyone. There's no way we can avoid stomping on one patent or another -- there are patents for everything.

Hasn't IBM released a pile of its patents for use (or at least stated they won't sue) to OSS projects? If so, is this patent covered by that "amnesty"?

Simply ignoring patents because "there is a patent for everything" is a recipe for disaster. Companies like MS are running out of ways to tear open OSS and they are certainly not above (below?) suing the heck out of OSS projects for patent infringement.

What's needed is reform in the USPTO. Call your congress(wo)man and complain, but don't flout the law; you will lose.

Madi
Madison Kelly wrote:
> Hasn't IBM released a pile of its patents for use (or at least stated they won't sue) to OSS projects? If so, is this patent covered by that "amnesty"?

This is useless as a policy, because we have plenty of companies basing their proprietary code on PostgreSQL, which wouldn't be subject to the grant (EnterpriseDB, Command Prompt, Fujitsu, SRA). We do support them.

> Simply ignoring patents because "there is a patent for everything" is a recipe for disaster. Companies like MS are running out of ways to tear open OSS and they are certainly not above (below?) suing the heck out of OSS projects for patent infringement.

It has been said that unknowingly infringing a patent is much less problematic than knowingly doing the same. We don't have the manpower to implement the whole of Postgres without infringing a single patent, so the best approach is to refrain from researching possible patents applying to us in the first place.

If people come here and point at patents that we infringe or may infringe, it will cause much more (useless) work for hackers, who then have to search for alternative ways of doing the same things.

> What's needed is reform in the USPTO. Call your congress(wo)man and complain, but don't flout the law; you will lose.

I agree. However, I am not a US inhabitant in the first place, and bless my parents for that. Heck, I was even denied a visa -- twice. Please do us all a favor and write to your congresspeople.

--
Alvaro Herrera    http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Alvaro Herrera wrote:
>> Hasn't IBM released a pile of its patents for use (or at least stated they won't sue) to OSS projects? If so, is this patent covered by that "amnesty"?
>
> This is useless as a policy, because we have plenty of companies basing their proprietary code on PostgreSQL, which wouldn't be subject to the grant (EnterpriseDB, Command Prompt, Fujitsu, SRA). We do support them.

More specifically, it is then up to the 3rd party (non-OSS) developers to clear the patents. It's not the PgSQL team's problem in this case (assuming it's the case at all).

> It has been said that unknowingly infringing a patent is much less problematic than knowingly doing the same. We don't have the manpower to implement the whole of Postgres without infringing a single patent, so the best approach is to refrain from researching possible patents applying to us in the first place.
>
> If people come here and point at patents that we infringe or may infringe, it will cause much more (useless) work for hackers, who then have to search for alternative ways of doing the same things.

"Plausible Deniability" and all that jazz. There is another truism though: "Ignorance of the law is no excuse". Besides, claiming ignorance doesn't keep you out of the courts in the first place. The people who would attack OSS applications generally have very, very deep pockets and can run a project out of money before the trial is over. They could do that nonetheless (SCO, hello?) but I still suggest NOT tempting fate. It's unfortunate that this legal mess causes the developers more headaches than they need, but it still can't be ignored, imho.

> I agree. However, I am not a US inhabitant in the first place, and bless my parents for that. Heck, I was even denied a visa -- twice. Please do us all a favor and write to your congresspeople.

Heh, I'm not an American either, so I can't do anything but shake my head (and be equally glad that my own personal OSS program is here in Canada). American industry wonders why they are losing so many IT positions... It's such a difficult and unfriendly environment there for anyone but the biggest companies. Far too litigious.

Madi
On Oct 16, 2006, at 16:17, Madison Kelly wrote:
> [snip]
> Simply ignoring patents because "there is a patent for everything" is a recipe for disaster. Companies like MS are running out of ways to tear open OSS and they are certainly not above (below?) suing the heck out of OSS projects for patent infringement.
>
> What's needed is reform in the USPTO. Call your congress(wo)man and complain, but don't flout the law; you will lose.

Alvaro's advice is sound. If the patent holder can prove that a developer looked at a patent (for example, from an email in a mailing list archive) and the project proceeded with the implementation regardless, malice can be shown and "damages" can be substantially higher. You're screwed either way, but your safest bet is to never look at patents.

Disclaimer: I am not a lawyer - I don't even like lawyers.

-M
AgentM wrote:
> Alvaro's advice is sound. If the patent holder can prove that a
> developer looked at a patent (for example, from an email in a mailing
> list archive) and the project proceeded with the implementation
> regardless, malice can be shown and "damages" can be substantially
> higher. You're screwed either way, but your safest bet is to never
> look at patents.
>
> Disclaimer: I am not a lawyer- I don't even like lawyers.

Nor am I a lawyer, but I still hold that hoping "ignorance" will be a
decent defense is very, very risky. In the end, I am not a pgSQL
developer, so it isn't in my hands either way.

Madi
On 10/17/06, Madison Kelly <linux@alteeve.com> wrote:
> AgentM wrote:
>> Alvaro's advice is sound. If the patent holder can prove that a
>> developer looked at a patent (for example, from an email in a mailing
>> list archive) and the project proceeded with the implementation
>> regardless, malice can be shown and "damages" can be substantially
>> higher. You're screwed either way, but your safest bet is to never
>> look at patents.
>>
>> Disclaimer: I am not a lawyer- I don't even like lawyers.
>
> Nor am I a lawyer, but I still hold that hoping "ignorance" will be a
> decent defense is very, very risky. In the end, I am not a pgSQL
> developer, so it isn't in my hands either way.
>
> Madi
I also am NAL, but I know enough about the patent system (in the US) to know that ignorance *IS* a defense. If you are ignorant of the patent, you only have to pay the damages. If you knew about the patent and did it anyway, you have to pay *triple* damages. Ignorance will save you lots of money.
You may not like it, but that's the way it is.
Brian Mathis wrote:
> I also am NAL, but I know enough about the patent system (in the US)
> to know that ignorance *IS* a defense. If you are ignorant of the
> patent, you only have to pay the damages. If you knew about the patent
> and did it anyway, you have to pay *triple* damages. Ignorance will
> save you lots of money.
>
> You may not like it, but that's the way it is.

I got that part. :) If you _do_ end up in court, plausible deniability
helps.

My position, though, is that it is better, in the long term, to be
aware of the patents and take the time to work around them so that *no*
damages need to be paid. Or, as might be the chance in this case, to
get a written "okay" from the patent holder for the use of the methods
protected by the patent in a given program.

Colour me funny, but wouldn't staying out of the courts in the first
place be the best option?

Madi
On 10/17/06, Madison Kelly <linux@alteeve.com> wrote:
> Brian Mathis wrote:
>> I also am NAL, but I know enough about the patent system (in the US)
>> to know that ignorance *IS* a defense. If you are ignorant of the
>> patent, you only have to pay the damages. If you knew about the
>> patent and did it anyway, you have to pay *triple* damages. Ignorance
>> will save you lots of money.
>>
>> You may not like it, but that's the way it is.
>
> I got that part. :) If you _do_ end up in court, plausible deniability
> helps.
>
> My position, though, is that it is better, in the long term, to be
> aware of the patents and take the time to work around them so that
> *no* damages need to be paid. Or, as might be the chance in this case,
> to get a written "okay" from the patent holder for the use of the
> methods protected by the patent in a given program.
>
> Colour me funny, but wouldn't staying out of the courts in the first
> place be the best option?
>
> Madi
Yes, good idea :)
> Colour me funny, but wouldn't staying out of the courts in the first
> place be the best option?

Yes. However, some people feel that, given the way the patent office is
spewing huge quantities of patents, many on old, well-known techniques,
the absurd difficulty of reading patent claims, and the deliberately
vague language used in the hope of broadening the scope later, it is
actually not possible to keep track of them, and therefore not possible
to avoid infringement with any certainty. So, better to fall back to
the second-best position...

It's a difficult question to answer because there are 0 good answers.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
Madison Kelly wrote:
> Brian Mathis wrote:
>> I also am NAL, but I know enough about the patent system (in the US)
>> to know that ignorance *IS* a defense. If you are ignorant of the
>> patent, you only have to pay the damages. If you knew about the
>> patent and did it anyway, you have to pay *triple* damages. Ignorance
>> will save you lots of money.
>>
>> You may not like it, but that's the way it is.
>
> I got that part. :) If you _do_ end up in court, plausible deniability
> helps.
>
> My position, though, is that it is better, in the long term, to be
> aware of the patents and take the time to work around them so that
> *no* damages need to be paid. Or, as might be the chance in this case,
> to get a written "okay" from the patent holder for the use of the
> methods protected by the patent in a given program.
>
> Colour me funny, but wouldn't staying out of the courts in the first
> place be the best option?

Yeah. I invite you to do all the extra (useless) development work
required. But please do not charge other people with it. Whoever
investigates patents and lets pgsql-hackers know about them is charging
the Postgres community with that work. We sure don't need it.

--
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Alvaro Herrera wrote:
> Yeah. I invite you to do all the extra (useless) development work
> required. But please do not charge other people with it. Whoever
> investigates patents and lets pgsql-hackers know about them is
> charging the Postgres community with that work. We sure don't need it.

As a developer (on a different project that uses pgSQL), I am no more
happy about the current status of the USPO than you are. I think it's a
giant cesspool of greed, ignorance and waste... BUT... it is currently
the law in the USA. How fast would Oracle, Microsoft or others jump on
a chance to tear PostgreSQL apart by dragging it through the courts and
crushing it under fines (regardless of the amount)? If that happened,
*all* of the pgsql-hackers' time would be wasted, never mind the
countless other projects that rely on PgSQL.

As Scott said, there are 0 good answers to this problem. If PgSQL is
going to be active in the US then it has to play by the (bent and
twisted) rules. That is why I started my comments with "call your
congress(wo)man". Simply ignoring that which you don't like is just far
too risky with the sharks in our waters. It's extra work, and it's
unfair to the developers who already put their free time into this
program, but IMHO it's the only safe way to go.

Again, though, not being a pgsql-hacker, my comments here are just
that.

Madi
On Oct 17, 2006, at 10:46 , Madison Kelly wrote:

> Brian Mathis wrote:
>> I also am NAL, but I know enough about the patent system (in the US)
>> to know that ignorance *IS* a defense. If you are ignorant of the
>> patent, you only have to pay the damages. If you knew about the
>> patent and did it anyway, you have to pay *triple* damages. Ignorance
>> will save you lots of money.
>>
>> You may not like it, but that's the way it is.
>
> I got that part. :) If you _do_ end up in court, plausible deniability
> helps.
>
> My position, though, is that it is better, in the long term, to be
> aware of the patents and take the time to work around them so that
> *no* damages need to be paid. Or, as might be the chance in this case,
> to get a written "okay" from the patent holder for the use of the
> methods protected by the patent in a given program.
>
> Colour me funny, but wouldn't staying out of the courts in the first
> place be the best option?

That would be a nice, but naive, approach. It is likely that (without
admitting any guilt by any party) PostgreSQL already steps on some
patents. In fact, any project you can think of likely steps on some
patents. There are patents on network communication: the "getting a
message from a server to a client" sort of thing. If you spent the next
twenty years searching through patents and creating patches for
PostgreSQL to circumvent them, not only would you turn PostgreSQL into
a shriveled raisin of its current self, you would be exposing
PostgreSQL to greater damages than if you had never looked at the
patents.

***Please- for the safety of everyone on these lists- do not mention or
link to any specific patents. This mailing list has a
publicly-accessible archive which could be used against me or you (at
least in the USA).***

Best regards,
M
On Tue, 2006-10-17 at 10:24 -0400, Madison Kelly wrote:
> Nor am I a lawyer, but I still hold that hoping "ignorance" will be a
> decent defense is very, very risky. In the end I am not a pgSQL
> developer so it isn't in my hands either way.

If I may. The "hoping ignorance will save you" line of defence is in
fact very, very risky. But "civil disobedience" is not. Well, it may be
risky, but it is valid. There is already strong support for the opinion
(e.g. the European battle against software patents) that current
patent/copyright regulations are devastating to innovation, and in fact
contradict their 'intended goal': the stimulation of innovation. At
this point, there is no other way but to ignore such legislation, to
the point where we can: like by not reading patent files. Be warned:
when the law strikes back, you lose.

Calming down a little, I'd add that this particular piece of
16th-century law does not quite fit today's reality.

Just my 2c... and I admit, it's not on the subject in subject :)

--
-R