#9455: Issues with aliases in WHERE and ORDER BY

slavakiev reported 2013-06-27T20:22:00Z · swissarmyknife(*) last modified 2013-09-27T21:27:17Z

Issues with aliases in WHERE and ORDER BY

doc

Dev	emil.goicovici
QA	tariqrahiman

Priority	Low
Complexity	Unknown

Component	MongoSQL
Version	14.0

Tested on Aqua Data Studio 14.0.0-beta-33 Build #: 32981 on Ubuntu 12.04( Mongo DB 2.4.3) and Windows XP ( Mongo DB 2.0.9).

Run next scriptset and see results:

insert into ordercoll(a,b) values (2,'2')

insert into ordercoll(a) values (1)

select ord.a from ordercoll ord order by ord.a

--a

--2

--1

--ORDER BY does not work

select a from ordercoll where a=b

--a

--2

select a as c from ordercoll where c=b

--c

--1

--should be same as upper

select ord.a as c from ordercoll ord where c=1

-c

--Empty result set but should be 1 row

select a from ordercoll ord where ord.a=ord.b

--Can not read property 'a' of undefined near 'ord.b'

All Comments (23) Change History

slavakiev 2013-06-27T20:57:01Z

More issues:

SELECT stats FROM baseball

WHERE stats.wins >= 80 AND stats.battingAvg > 0.25

--[ { "year" : 1904, "wins" : 100, "mostRbis" : 1000, "grade" : "A", "battingAvg" : 0.3 }, { "year" : 1987, "wins" : 80, "mostRbis" : 200, "grade" : "A", "battingAvg" : 0.267 } ]

SELECT "stats" as st FROM baseball

WHERE st.wins >= 80 AND st.battingAvg > 0.25

--st

--empty result set but should be row as in upper

SELECT "stats" as st FROM baseball WHERE st[wins >= 80 AND battingAvg > 0.25]

--[ { "year" : 1904, "wins" : 100, "mostRbis" : 1000, "grade" : "A", "battingAvg" : 0.3 }, { "year" : 1987, "wins" : 80, "mostRbis" : 200, "grade" : "A", "battingAvg" : 0.267 } ]

SELECT b."stats" as st FROM baseball b WHERE st[wins >= 80 AND battingAvg > 0.25]

--st

--empty result set but should be a row as in upper

JennyNishimura 2013-06-27T21:07:42Z

Slava, please note that the column alias is only a display label and can be used in GROUP BY, ORDER BY, and HAVING, but it cannot be used in the WHERE clause.

select a as c from ordercoll where c=b is invalid. It should be: select a as c from ordercoll where a=b

The following is from MySQL documentation:

A select_expr can be given an alias using AS alias_name. The alias is used as the expression's column name and can be used in GROUP BY, ORDER BY, or HAVING clauses.

slavakiev 2013-06-28T18:40:27Z

Ok, I leave only 2 cases:

insert into ordercoll(a,b) values (2,'2')

insert into ordercoll(a) values (1)

select ord.a from ordercoll ord order by ord.a

--a

--2

--1

--ORDER BY does not work

select a from ordercoll ord where ord.a=ord.b

--Can not read property 'a' of undefined near 'ord.b'

emil.goicovici 2013-07-04T20:35:23Z

I've fixed the JDBC driver implementation to support collection name/aliases prefixes on WHERE filters and ORDER BY rules.

On the other hand, I've completely removed the support for aliases on WHERE filters, because it can conflict with existing fields

e.g.

insert into testColl values({a:1, b:2, c:1})

go

select a, b as c from testColl where a=c

This means that from the initial scenarios, the ones with where c=b or where c=1 are not fixed.

If we support aliases on the WHERE filters, the query above won't return any result, which might generate some confusion. Therefore, aliases should be supported only on the clauses that are applied after projection ( HAVING , ORDER BY etc).

I've tested aliases under WHERE filters with the MySQL and I noticed that this RDBMS doesn't support them.

I've fixed the JDBC driver implementation to support collection name/aliases prefixes on WHERE filters and ORDER BY rules.

On the other hand, I've completely removed the support for aliases on WHERE filters, because it can conflict with existing fields

e.g.

insert into testColl values({a:1, b:2, c:1})

go

select a, b as c from testColl where a=c

This means that from the initial scenarios, the ones with where c=b or where c=1 are not fixed.

I've tested aliases under WHERE filters with the MySQL and I noticed that this RDBMS doesn't support them.

slavakiev 2013-07-09T12:37:43Z

Tested in Aqua Data Studio Aqua Data Studio 14.0.0-beta-59 Build #: 33157

SELECT ord.a as oa FROM ordercoll ord GROUP BY oa

returns no rows while next statements returns 4:

SELECT ord.a as oa FROM ordercoll ord GROUP BY ord.a

slavakiev 2013-07-12T11:45:37Z

Tested on Aqua Data Studio Aqua Data Studio 14.0.0-beta-64 Build #: 33188 on Ubuntu 12.04( Mongo DB 2.4.3) and Windows XP ( Mongo DB 2.0.9).

Once b field added even without alias then next error occures:

SELECT ord.a as oa,b FROM ordercoll ord GROUP BY oa,b

-- dotted field names are only allowed at the top level

slavakiev 2013-07-18T18:20:52Z

Tested Aqua Data Studio Aqua Data Studio 14.0.0-beta-68 Build #: 33238 on Ubuntu 12.04( Mongo DB 2.4.3) and Windows XP ( Mongo DB 2.0.9).

select ord.a as oa,ord.b as ob from ordercoll ord group by oa,ob having a=1 and b='1'

--empty result set

--but when same aliases then 1 row returns

select ord.a as oa,ord.b as ob from ordercoll ord group by oa,ob having oa=1 and ob='1'

oa ob

---- ----

1 1

emil.goicovici 2013-08-06T15:03:35Z

This scenario has been fixed.

slavakiev 2013-08-13T19:30:00Z

Tested Aqua Data Studio Aqua Data Studio 14.0.0-beta-82 Build #: 33525 on Ubuntu 12.04( Mongo DB 2.4.3)

There is still no universal solution. Necessary to subst all field aliases/name on _id. This will prevent to have many combinations :

drop collection ordercoll

insert into ordercoll(a,b) values (2,'2')

insert into ordercoll(a,b) values (1,'1')

insert into ordercoll(a) values (5)

select ord.a as oa from ordercoll ord group by oa order by a

--order by does not work

--oa

-------

--5

--1

--2

slavakiev 2013-08-17T20:37:15Z

Tested in Aqua Data Studio Aqua Data Studio 14.0.0-beta-79 Build #: 33402 on Ubuntu 12.04( Mongo DB 2.4.3) and Windows XP ( Mongo DB 2.0.9).

One more case. Run next scriptset and see result:

insert into reccoll(stats.year,stats.wins) values( 2001,21)

insert into reccoll(stats.year,stats.wins) values( 2000,20)

insert into reccoll(stats.year,stats.wins) values( 1999,19)

select stats as sts from reccoll order by sts.year

--wrong sort order

--sts

------------------------------

--{ "year" : 2001, "wins" : 21 }

--{ "year" : 2000, "wins" : 20 }

--{ "year" : 1999, "wins" : 19 }

select stats as sts from reccoll order by stats.year

--correct sort order

--sts

------------------------------

--{ "year" : 1999, "wins" : 19 }

--{ "year" : 2000, "wins" : 20 }

--{ "year" : 2001, "wins" : 21 }

See proposed solution in the ticket 9711.

emil.goicovici 2013-08-29T21:31:21Z

The solution proposed for issue #9711 cannot be applied as a general rule because sometimes the sort-by key should be the same as the output column (the alias name). Such case occurs for example when Aggregation Framework is used, the $sort pipeline operator being applied among the last ones, after any $project and $match operators. In this case the alias name should be used as the sort-by key.

I've fixed the failing scenario.

slavakiev 2013-09-02T19:07:23Z

Tested in Aqua Data Studio Aqua Data Studio 14.0.0-rc-8 Build #: 33812 on Ubuntu 12.04( Mongo DB 2.4.3)

Adding DESC to the next resolving query does not work:

select stats as sts from reccoll order by sts.year desc

--sts

------------------------------

--{ "year" : 1999, "wins" : 19 }

--{ "year" : 2000, "wins" : 20 }

--{ "year" : 2001, "wins" : 21 }

slavakiev 2013-09-09T18:31:25Z

Tested in Aqua Data Studio Aqua Data Studio 14.0.0-rc-13 Build #: 33892 on Ubuntu 12.04( Mongo DB 2.4.3)

Using same collection as above:

drop collection reccoll

insert into reccoll(stats.year,stats.wins) values( 2001,21)

insert into reccoll(stats.year,stats.wins) values( 2000,20)

insert into reccoll(stats.year,stats.wins) values( 1999,19)

select stats.year+1 as st from reccoll order by stats.year+1

--wrong order

select stats.year+1 from reccoll order by stats.year+1

--correct order

Also when alias has same name as field:

select valuation+1 as valuation from baseball

group by valuation having valuation=10

--empty result set

select valuation+1 as valuation from baseball

group by valuation

--returns 12 instead 11!

emil.goicovici 2013-09-12T14:09:37Z

When the alias of an arithmetic operation overlaps a group-by key, that arithmetic operation is now performed only once.

slavakiev 2013-09-17T10:54:53Z

Tested in Aqua Data Studio Aqua Data Studio 14.0.0-rc-20 Build #: 33993 on Ubuntu 12.04( Mongo DB 2.4.3)

Next statement from previous comment does not work:

select valuation+1 as valuation from baseball

group by valuation having valuation=10

--empty result set

Also when aliases used for the functions - they work but without aliases give error. See next example:

SELECT min(valuation) FROM baseball group by min(valuation)

--Invalid GROUP BY item: CALL

SELECT min(valuation) as mn FROM baseball group by mn

--returns 10

emil.goicovici 2013-09-18T08:51:35Z

Next statement from previous comment does not work:

select valuation+1 as valuation from baseball

group by valuation having valuation=10

--empty result set

This query returns empty resultset because the valuation expression under the HAVING clause refers to the alias, not the original field. Thus the HAVING filter becomes valuation+1=10 . There is no document in the baseball collection having the value 9 for the field valuation.

If you change your query to:

select valuation+1 as valuation from baseball group by valuation having valuation=11

then you'll get the expected result.

Also when aliases used for the functions - they work but without aliases give error. See next example:

SELECT min(valuation) FROM baseball group by min(valuation)

--Invalid GROUP BY item: CALL

Please open a separate issue for this problem. It is not related to the description of the current issue.

slavakiev 2013-09-18T10:07:46Z · (edited)

I tested this query( select valuation+1 as valuation from baseball

group by valuation having valuation=10 )

in MS SQL Server, Oracle and get 11. Because RDBMS uses alias in this case just as title for the output so in HAVING real field is used.

emil.goicovici 2013-09-18T10:27:30Z

in MS SQL Server and get 11. Because RDBMS uses alias in this case just as title for the output so in HAVING real field is used.

This doesn't make sense. On an aggregate SELECT query (i.e. when the GROUP BY is present), the WHERE clause is intended to be used to filter the documents before performing the GROUP BY operation, while the HAVING clause should be used to filter the columns resulted after this aggregation step.

For your given scenario, if we would consider the original field names, ambiguity can occur when the alias name is the same with the field name (i.e. the HAVING filter should apply to the valuation alias or to to the group-by key valuation field)?

I think the MongoDB JDBC driver becomes too heavy and tends to be claimed as a RDBMS itself, which is beyond its scope as a JDBC driver.

Inside MongoShell, when Aggregation Framework is used, when the $project pipeline operator is used (for renaming or arithmetic operations), subsequent pipeline operators (e.g. a $match one, corresponding to a HAVING clause) are expected to refer to the output fields only. Inside MongoSQL, we follow the same behaviour.

emil.goicovici 2013-09-18T10:54:27Z · (edited)

Implementing your request would mean to move the last $match pipeline operator corresponding to the HAVING clause right after the $group pipeline operator and before any subsequent $project ones. But this would break other legitimate queries users might expect to work (and that are currently working):

e.g.:

select valuation+1 as val from baseball group by valuation having val<10

As you can see, a general rule doesn't work in this case. Unlike RDBMS that process the query, perform the aggregation, filter the results and project the columns with the desired names (i.e. the column titles) in the same place (server-side), the MongoDB JDBC driver should perform all these steps through the quering API (i.e. the MongoShell-like commands) provided by MongoDB. Therefore, this JDBC driver can't ever implement all the functionality of a RDBMS (where the query processing/optimisation etc is performed in-house, inside the DB). Is rather a thin layer that allows quering MongoDB using SQL queries.

slavakiev 2013-09-18T11:10:31Z

This can be performed by internal naming of aliases. It can be also usefull in other cases not only this one. So valuation+1 you call saying "WHATvaluation" and in the lastest special $project rename it to the user's alias - "valuation" and do not touch $match etc. But necessary to have list of collection field names to see wheather or not alias matches any field name. It can be time expensive. Lets wait for Jenny or Sachin decission. Probably has sence to add comment to the documantation that if alias has same name as any field then field name will be substituted by alias in all clauses contrary as RDBMS working

emil.goicovici 2013-09-18T11:26:25Z

But necessary to have list of collection field names to see wheather or not alias matches any field name. It can be time expensive.

Obtaining the list of collection's field names is trivial for relational databases where a given table has a fixed schema (structured data) and each row has the same columns. For MongoDB, the unstructured nature of the documents belonging to the same collection means that we would have to iterate over all collection's documents (before performing the given query itself), which is unacceptable in my opinion.

JennyNishimura 2013-09-18T15:54:56Z

Let's document how alias names are applied in the HAVING and WHERE clauses.

Jonathan, please add the following note in the "HAVING clause" section of the MongoSQL Query Reference.

On an aggregate SELECT query (i.e. when the GROUP BY clause is present), the WHERE clause filters the documents before performing the GROUP BY operation while the HAVING clause filters the result set after the aggregation step. In the WHERE clause, the collection's field names are used when processing the WHERE filters. However, the field names specified in the HAVING clause are matched with the alias names specified in the SELECT fields first when processing the HAVING filters.

Examples:

SELECT worldChampionships+1 as worldChampionships FROM baseball
WHERE worldChampionships=9

The above query returns the document where the value of the worldChampionships field is 9.

SELECT worldChampionships+1 as worldChampionships FROM baseball
GROUP BY worldChampionships HAVING worldChampionships=9

The above query uses an alias worldChampionships for the value of worldChampionships+1. The worldChampionships field in the HAVING clause refers to the value of worldChampionships+1. Therefore, the query returns the document where worldChampionships+1=9.

SELECT worldChampionships+1 as worldChampionships FROM baseball
GROUP BY worldChampionships HAVING worldChampionships=10

The above query returns the document where worldChampionships+1=10.

swissarmyknife(*) 2013-09-27T21:27:17Z

Refresh to see its inclusion and let me know if this is okay.

Search Tips

Aqua Data Studio / nhilam

Issues with aliases in WHERE and ORDER BY

Issue #9455

Completion

2 issue links

Issue #10269

Issue #9453

Search Tips

Aqua Data Studio / nhilam

Title

Issues with aliases in WHERE and ORDER BY

Issue #9455

Completion

2 issue links

Issue #10269

Issue #9453