commit e55831ab5c on dev by Bill Sun, 2 years ago

.gitignore

@@ -0,0 +1,40 @@
*.log
*.pyc
*.tab
out
.idea
.svn
*.iml
/mo_sql_parsing.egg-info
/build
/dist
/mo-sql-parsing
vendor/
._*
.DS_Store
.eggs
.vscode
out.k
k
*.so
*.pdf
test*.c*
*.csv
!test.csv
!moving_avg.csv
!nyctx100.csv
*.out
*.asm
!mmw.so
*.k
!header.k
!join.k
**/.vs
**/x64
*.user
*.filters
*.tmp
*.bin
*.shm
server/packages/**
*.ipynb

@@ -0,0 +1,363 @@
Mozilla Public License, version 2.0
1. Definitions
1.1. "Contributor"
means each individual or legal entity that creates, contributes to the
creation of, or owns Covered Software.
1.2. "Contributor Version"
means the combination of the Contributions of others (if any) used by a
Contributor and that particular Contributor's Contribution.
1.3. "Contribution"
means Covered Software of a particular Contributor.
1.4. "Covered Software"
means Source Code Form to which the initial Contributor has attached the
notice in Exhibit A, the Executable Form of such Source Code Form, and
Modifications of such Source Code Form, in each case including portions
thereof.
1.5. "Incompatible With Secondary Licenses"
means
a. that the initial Contributor has attached the notice described in
Exhibit B to the Covered Software; or
b. that the Covered Software was made available under the terms of
version 1.1 or earlier of the License, but not also under the terms of
a Secondary License.
1.6. "Executable Form"
means any form of the work other than Source Code Form.
1.7. "Larger Work"
means a work that combines Covered Software with other material, in a
separate file or files, that is not Covered Software.
1.8. "License"
means this document.
1.9. "Licensable"
means having the right to grant, to the maximum extent possible, whether
at the time of the initial grant or subsequently, any and all of the
rights conveyed by this License.
1.10. "Modifications"
means any of the following:
a. any file in Source Code Form that results from an addition to,
deletion from, or modification of the contents of Covered Software; or
b. any new file in Source Code Form that contains any Covered Software.
1.11. "Patent Claims" of a Contributor
means any patent claim(s), including without limitation, method,
process, and apparatus claims, in any patent Licensable by such
Contributor that would be infringed, but for the grant of the License,
by the making, using, selling, offering for sale, having made, import,
or transfer of either its Contributions or its Contributor Version.
1.12. "Secondary License"
means either the GNU General Public License, Version 2.0, the GNU Lesser
General Public License, Version 2.1, the GNU Affero General Public
License, Version 3.0, or any later versions of those licenses.
1.13. "Source Code Form"
means the form of the work preferred for making modifications.
1.14. "You" (or "Your")
means an individual or a legal entity exercising rights under this
License. For legal entities, "You" includes any entity that controls, is
controlled by, or is under common control with You. For purposes of this
definition, "control" means (a) the power, direct or indirect, to cause
the direction or management of such entity, whether by contract or
otherwise, or (b) ownership of more than fifty percent (50%) of the
outstanding shares or beneficial ownership of such entity.
2. License Grants and Conditions
2.1. Grants
Each Contributor hereby grants You a world-wide, royalty-free,
non-exclusive license:
a. under intellectual property rights (other than patent or trademark)
Licensable by such Contributor to use, reproduce, make available,
modify, display, perform, distribute, and otherwise exploit its
Contributions, either on an unmodified basis, with Modifications, or
as part of a Larger Work; and
b. under Patent Claims of such Contributor to make, use, sell, offer for
sale, have made, import, and otherwise transfer either its
Contributions or its Contributor Version.
2.2. Effective Date
The licenses granted in Section 2.1 with respect to any Contribution
become effective for each Contribution on the date the Contributor first
distributes such Contribution.
2.3. Limitations on Grant Scope
The licenses granted in this Section 2 are the only rights granted under
this License. No additional rights or licenses will be implied from the
distribution or licensing of Covered Software under this License.
Notwithstanding Section 2.1(b) above, no patent license is granted by a
Contributor:
a. for any code that a Contributor has removed from Covered Software; or
b. for infringements caused by: (i) Your and any other third party's
modifications of Covered Software, or (ii) the combination of its
Contributions with other software (except as part of its Contributor
Version); or
c. under Patent Claims infringed by Covered Software in the absence of
its Contributions.
This License does not grant any rights in the trademarks, service marks,
or logos of any Contributor (except as may be necessary to comply with
the notice requirements in Section 3.4).
2.4. Subsequent Licenses
No Contributor makes additional grants as a result of Your choice to
distribute the Covered Software under a subsequent version of this
License (see Section 10.2) or under the terms of a Secondary License (if
permitted under the terms of Section 3.3).
2.5. Representation
Each Contributor represents that the Contributor believes its
Contributions are its original creation(s) or it has sufficient rights to
grant the rights to its Contributions conveyed by this License.
2.6. Fair Use
This License is not intended to limit any rights You have under
applicable copyright doctrines of fair use, fair dealing, or other
equivalents.
2.7. Conditions
Sections 3.1, 3.2, 3.3, and 3.4 are conditions of the licenses granted in
Section 2.1.
3. Responsibilities
3.1. Distribution of Source Form
All distribution of Covered Software in Source Code Form, including any
Modifications that You create or to which You contribute, must be under
the terms of this License. You must inform recipients that the Source
Code Form of the Covered Software is governed by the terms of this
License, and how they can obtain a copy of this License. You may not
attempt to alter or restrict the recipients' rights in the Source Code
Form.
3.2. Distribution of Executable Form
If You distribute Covered Software in Executable Form then:
a. such Covered Software must also be made available in Source Code Form,
as described in Section 3.1, and You must inform recipients of the
Executable Form how they can obtain a copy of such Source Code Form by
reasonable means in a timely manner, at a charge no more than the cost
of distribution to the recipient; and
b. You may distribute such Executable Form under the terms of this
License, or sublicense it under different terms, provided that the
license for the Executable Form does not attempt to limit or alter the
recipients' rights in the Source Code Form under this License.
3.3. Distribution of a Larger Work
You may create and distribute a Larger Work under terms of Your choice,
provided that You also comply with the requirements of this License for
the Covered Software. If the Larger Work is a combination of Covered
Software with a work governed by one or more Secondary Licenses, and the
Covered Software is not Incompatible With Secondary Licenses, this
License permits You to additionally distribute such Covered Software
under the terms of such Secondary License(s), so that the recipient of
the Larger Work may, at their option, further distribute the Covered
Software under the terms of either this License or such Secondary
License(s).
3.4. Notices
You may not remove or alter the substance of any license notices
(including copyright notices, patent notices, disclaimers of warranty, or
limitations of liability) contained within the Source Code Form of the
Covered Software, except that You may alter any license notices to the
extent required to remedy known factual inaccuracies.
3.5. Application of Additional Terms
You may choose to offer, and to charge a fee for, warranty, support,
indemnity or liability obligations to one or more recipients of Covered
Software. However, You may do so only on Your own behalf, and not on
behalf of any Contributor. You must make it absolutely clear that any
such warranty, support, indemnity, or liability obligation is offered by
You alone, and You hereby agree to indemnify every Contributor for any
liability incurred by such Contributor as a result of warranty, support,
indemnity or liability terms You offer. You may include additional
disclaimers of warranty and limitations of liability specific to any
jurisdiction.
4. Inability to Comply Due to Statute or Regulation
If it is impossible for You to comply with any of the terms of this License
with respect to some or all of the Covered Software due to statute,
judicial order, or regulation then You must: (a) comply with the terms of
this License to the maximum extent possible; and (b) describe the
limitations and the code they affect. Such description must be placed in a
text file included with all distributions of the Covered Software under
this License. Except to the extent prohibited by statute or regulation,
such description must be sufficiently detailed for a recipient of ordinary
skill to be able to understand it.
5. Termination
5.1. The rights granted under this License will terminate automatically if You
fail to comply with any of its terms. However, if You become compliant,
then the rights granted under this License from a particular Contributor
are reinstated (a) provisionally, unless and until such Contributor
explicitly and finally terminates Your grants, and (b) on an ongoing
basis, if such Contributor fails to notify You of the non-compliance by
some reasonable means prior to 60 days after You have come back into
compliance. Moreover, Your grants from a particular Contributor are
reinstated on an ongoing basis if such Contributor notifies You of the
non-compliance by some reasonable means, this is the first time You have
received notice of non-compliance with this License from such
Contributor, and You become compliant prior to 30 days after Your receipt
of the notice.
5.2. If You initiate litigation against any entity by asserting a patent
infringement claim (excluding declaratory judgment actions,
counter-claims, and cross-claims) alleging that a Contributor Version
directly or indirectly infringes any patent, then the rights granted to
You by any and all Contributors for the Covered Software under Section
2.1 of this License shall terminate.
5.3. In the event of termination under Sections 5.1 or 5.2 above, all end user
license agreements (excluding distributors and resellers) which have been
validly granted by You or Your distributors under this License prior to
termination shall survive termination.
6. Disclaimer of Warranty
Covered Software is provided under this License on an "as is" basis,
without warranty of any kind, either expressed, implied, or statutory,
including, without limitation, warranties that the Covered Software is free
of defects, merchantable, fit for a particular purpose or non-infringing.
The entire risk as to the quality and performance of the Covered Software
is with You. Should any Covered Software prove defective in any respect,
You (not any Contributor) assume the cost of any necessary servicing,
repair, or correction. This disclaimer of warranty constitutes an essential
part of this License. No use of any Covered Software is authorized under
this License except under this disclaimer.
7. Limitation of Liability
Under no circumstances and under no legal theory, whether tort (including
negligence), contract, or otherwise, shall any Contributor, or anyone who
distributes Covered Software as permitted above, be liable to You for any
direct, indirect, special, incidental, or consequential damages of any
character including, without limitation, damages for lost profits, loss of
goodwill, work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses, even if such party shall have been
informed of the possibility of such damages. This limitation of liability
shall not apply to liability for death or personal injury resulting from
such party's negligence to the extent applicable law prohibits such
limitation. Some jurisdictions do not allow the exclusion or limitation of
incidental or consequential damages, so this exclusion and limitation may
not apply to You.
8. Litigation
Any litigation relating to this License may be brought only in the courts
of a jurisdiction where the defendant maintains its principal place of
business and such litigation shall be governed by laws of that
jurisdiction, without reference to its conflict-of-law provisions. Nothing
in this Section shall prevent a party's ability to bring cross-claims or
counter-claims.
9. Miscellaneous
This License represents the complete agreement concerning the subject
matter hereof. If any provision of this License is held to be
unenforceable, such provision shall be reformed only to the extent
necessary to make it enforceable. Any law or regulation which provides that
the language of a contract shall be construed against the drafter shall not
be used to construe this License against a Contributor.
10. Versions of the License
10.1. New Versions
Mozilla Foundation is the license steward. Except as provided in Section
10.3, no one other than the license steward has the right to modify or
publish new versions of this License. Each version will be given a
distinguishing version number.
10.2. Effect of New Versions
You may distribute the Covered Software under the terms of the version
of the License under which You originally received the Covered Software,
or under the terms of any subsequent version published by the license
steward.
10.3. Modified Versions
If you create software not governed by this License, and you want to
create a new license for such software, you may create and use a
modified version of this License if you rename the license and remove
any references to the name of the license steward (except to note that
such modified license differs from this License).
10.4. Distributing Source Code Form that is Incompatible With Secondary
Licenses If You choose to distribute Source Code Form that is
Incompatible With Secondary Licenses under the terms of this version of
the License, the notice described in Exhibit B of this License must be
attached.
Exhibit A - Source Code Form License Notice
This Source Code Form is subject to the
terms of the Mozilla Public License, v.
2.0. If a copy of the MPL was not
distributed with this file, You can
obtain one at
http://mozilla.org/MPL/2.0/.
If it is not possible or desirable to put the notice in a particular file,
then You may include the notice in a location (such as a LICENSE file in a
relevant directory) where a recipient would be likely to look for such a
notice.
You may add additional accurate notices of copyright ownership.
Exhibit B - "Incompatible With Secondary Licenses" Notice
This Source Code Form is "Incompatible
With Secondary Licenses", as defined by
the Mozilla Public License, v. 2.0.

@@ -0,0 +1,15 @@
OS_SUPPORT =
ifeq ($(OS),Windows_NT)
OS_SUPPORT += server/winhelper.cpp
endif
$(info $(OS_SUPPORT))
server.bin:
g++ server/server.cpp $(OS_SUPPORT) --std=c++1z -O3 -march=native -o server.bin
server.so:
g++ server/server.cpp -shared $(OS_SUPPORT) --std=c++1z -O3 -march=native -o server.so
snippet:
g++ -shared -fPIC --std=c++1z out.cpp -O3 -march=native -o dll.so
clean:
rm -rf *.shm
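# Usage notes (summary of the targets above): `make server.bin` builds the
# standalone server; `make server.so` builds it as a shared library;
# `make snippet` compiles the generated out.cpp into dll.so; `make clean`
# removes shared-memory (*.shm) files.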

@@ -0,0 +1,44 @@
# AQuery++ DB
## Introduction
AQuery++ Database is a cross-platform, in-memory, column-store database that incorporates compiled query execution.
Its compiler frontend is built on top of [mo-sql-parsing](https://github.com/klahnakoski/mo-sql-parsing).
## Roadmap
- [x] SQL Parser -> AQuery Parser (Front End)
- [ ] AQuery-C++ Compiler (Back End)
- [x] Schema and Data Model
- [x] Data acquisition/output from/to csv file
- [x] Single table queries
- [x] Projections and Single Table Aggregations
- [x] Group by Aggregations
- [x] Filters
- [x] Order by
- [x] Assumption
- [x] Flatten
- [ ] Multi-table
- [ ] Join
- [ ] Subqueries
- [ ] -> Optimizing Compiler
## TODO:
- [ ] C++ Meta-Programming: Eliminate template recursion as much as possible.
- [ ] IPC: Better ways to communicate between the interpreter (Python) and the executor (C++).
- [ ] Sockets? stdin/stdout capture?
## Requirements
A recent version of Linux, Windows, or macOS, with a C++ compiler that supports C++17 (1z) (e.g. gcc 6.0+, MSVC 2017+, clang 6.0+), and Python 3.6 or above.
## Usage
`python3 prompt.py` launches the interactive command prompt. The server binary is automatically rebuilt and started.
#### Commands:
- `<sql statement>`: parse a SQL statement
- `f <filename>`: parse all SQL statements in a file
- `print`: print the parsed SQL statements
- `exec`: execute the last parsed statement(s)
- `r`: run the last generated code snippet
- `save <OPTIONAL: filename>`: save the current code snippet; a random filename is used if none is specified
- `exit`: quit the prompt
#### Example:
`f moving_avg.a` <br>
`exec`

This parses all statements in `moving_avg.a` and then executes them.
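The parser can also be driven programmatically (a sketch; `aquery_parser` is the vendored mo-sql-parsing fork added in this commit, and the output shape is assumed from upstream's documentation):

```python
from aquery_parser import parse

# Parse a query into a JSON-like dict, mo-sql-parsing style
tree = parse("SELECT avg(price) FROM stocks GROUP BY ticker")
print(tree)
# e.g. {"select": {"value": {"avg": "price"}}, "from": "stocks", "groupby": {"value": "ticker"}}
```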

@@ -0,0 +1,88 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
from __future__ import absolute_import, division, unicode_literals
import json
from threading import Lock
from aquery_parser import sql_parser, utils
from aquery_parser.sql_parser import scrub
from aquery_parser.utils import ansi_string, simple_op, normal_op
parse_locker = Lock() # ENSURE ONLY ONE PARSING AT A TIME
common_parser = None
mysql_parser = None
sqlserver_parser = None
SQL_NULL = {"null": {}}
def parse(sql, null=SQL_NULL, calls=simple_op):
"""
:param sql: String of SQL
:param null: What value to use as NULL (default is the null function `{"null":{}}`)
:return: parse tree
"""
global common_parser
with parse_locker:
if not common_parser:
common_parser = sql_parser.common_parser()
result = _parse(common_parser, sql, null, calls)
return result
def parse_mysql(sql, null=SQL_NULL, calls=simple_op):
"""
PARSE MySQL ASSUME DOUBLE QUOTED STRINGS ARE LITERALS
:param sql: String of SQL
:param null: What value to use as NULL (default is the null function `{"null":{}}`)
:return: parse tree
"""
global mysql_parser
with parse_locker:
if not mysql_parser:
mysql_parser = sql_parser.mysql_parser()
return _parse(mysql_parser, sql, null, calls)
def parse_sqlserver(sql, null=SQL_NULL, calls=simple_op):
"""
PARSE SQL SERVER SYNTAX (SQUARE BRACKETS QUOTE IDENTIFIERS)
:param sql: String of SQL
:param null: What value to use as NULL (default is the null function `{"null":{}}`)
:return: parse tree
"""
global sqlserver_parser
with parse_locker:
if not sqlserver_parser:
sqlserver_parser = sql_parser.sqlserver_parser()
return _parse(sqlserver_parser, sql, null, calls)
parse_bigquery = parse_mysql
def _parse(parser, sql, null, calls):
utils.null_locations = []
utils.scrub_op = calls
sql = sql.rstrip().rstrip(";")
parse_result = parser.parse_string(sql, parse_all=True)
output = scrub(parse_result)
for o, n in utils.null_locations:
o[n] = null
return output
_ = json.dumps
__all__ = ["parse", "parse_mysql", "parse_sqlserver", "parse_bigquery", "normal_op", "simple_op"]
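# Example (sketch; output shape assumed from upstream mo-sql-parsing):
#
#     from aquery_parser import parse
#     parse("SELECT a FROM b WHERE x > 1")
#     # -> {"select": {"value": "a"}, "from": "b", "where": {"gt": ["x", 1]}}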

@@ -0,0 +1,403 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
# SQL CONSTANTS
from mo_parsing import *
from aquery_parser.utils import SQL_NULL, keyword
NULL = keyword("null") / (lambda: SQL_NULL)
TRUE = keyword("true") / (lambda: True)
FALSE = keyword("false") / (lambda: False)
NOCASE = keyword("nocase")
ASC = keyword("asc")
DESC = keyword("desc")
# SIMPLE KEYWORDS
AS = keyword("as").suppress()
ASSUMING = keyword("assuming")
ALL = keyword("all")
BY = keyword("by").suppress()
CAST = keyword("cast")
CONSTRAINT = keyword("constraint").suppress()
CREATE = keyword("create").suppress()
CROSS = keyword("cross")
DISTINCT = keyword("distinct")
EXCEPT = keyword("except")
FETCH = keyword("fetch").suppress()
FROM = keyword("from").suppress()
FULL = keyword("full")
FUNCTION = keyword("function").suppress()
GROUP = keyword("group").suppress()
HAVING = keyword("having").suppress()
INNER = keyword("inner")
INTERVAL = keyword("interval")
JOIN = keyword("join")
LEFT = keyword("left")
LIKE = keyword("like")
LIMIT = keyword("limit").suppress()
MINUS = keyword("minus")
OFFSET = keyword("offset").suppress()
ON = keyword("on").suppress()
ORDER = keyword("order").suppress()
OUTER = keyword("outer")
OVER = keyword("over").suppress()
PARTITION = keyword("partition").suppress()
# PERCENT = keyword("percent").suppress()
RIGHT = keyword("right")
RLIKE = keyword("rlike")
SELECT = keyword("select").suppress()
TABLE = keyword("table").suppress()
THEN = keyword("then").suppress()
TOP = keyword("top").suppress()
UNION = keyword("union")
INTERSECT = keyword("intersect")
USING = keyword("using").suppress()
WHEN = keyword("when").suppress()
WHERE = keyword("where").suppress()
WITH = keyword("with").suppress()
WITHIN = keyword("within").suppress()
PRIMARY = keyword("primary").suppress()
FOREIGN = keyword("foreign").suppress()
KEY = keyword("key").suppress()
UNIQUE = keyword("unique").suppress()
INDEX = keyword("index").suppress()
REFERENCES = keyword("references").suppress()
RECURSIVE = keyword("recursive").suppress()
VALUES = keyword("values").suppress()
WINDOW = keyword("window")
INTO = keyword("into").suppress()
PRIMARY_KEY = Group(PRIMARY + KEY).set_parser_name("primary_key")
FOREIGN_KEY = Group(FOREIGN + KEY).set_parser_name("foreign_key")
# SIMPLE OPERATORS
CONCAT = Literal("||").set_parser_name("concat")
MUL = Literal("*").set_parser_name("mul")
DIV = Literal("/").set_parser_name("div")
MOD = Literal("%").set_parser_name("mod")
NEG = Literal("-").set_parser_name("neg")
ADD = Literal("+").set_parser_name("add")
SUB = Literal("-").set_parser_name("sub")
BINARY_NOT = Literal("~").set_parser_name("binary_not")
BINARY_AND = Literal("&").set_parser_name("binary_and")
BINARY_OR = Literal("|").set_parser_name("binary_or")
GTE = Literal(">=").set_parser_name("gte")
LTE = Literal("<=").set_parser_name("lte")
LT = Literal("<").set_parser_name("lt")
GT = Literal(">").set_parser_name("gt")
EEQ = (
# conservative equality https://github.com/klahnakoski/jx-sqlite/blob/dev/docs/Logical%20Equality.md#definitions
Literal("==") | Literal("=")
).set_parser_name("eq")
DEQ = (
# decisive equality
# https://sparkbyexamples.com/apache-hive/hive-relational-arithmetic-logical-operators/
Literal("<=>").set_parser_name("eq!")
)
IDF = (
# decisive equality
# https://prestodb.io/docs/current/functions/comparison.html#is-distinct-from-and-is-not-distinct-from
keyword("is distinct from").set_parser_name("eq!")
)
INDF = (
# decisive equality
# https://prestodb.io/docs/current/functions/comparison.html#is-distinct-from-and-is-not-distinct-from
keyword("is not distinct from").set_parser_name("ne!")
)
FASSIGN = Literal(":=").set_parser_name("fassign") # Assignment in UDFs
NEQ = (Literal("!=") | Literal("<>")).set_parser_name("neq")
LAMBDA = Literal("->").set_parser_name("lambda")
AND = keyword("and")
BETWEEN = keyword("between")
CASE = keyword("case").suppress()
COLLATE = keyword("collate")
END = keyword("end")
ELSE = keyword("else").suppress()
IN = keyword("in")
IS = keyword("is")
NOT = keyword("not")
OR = keyword("or")
LATERAL = keyword("lateral")
VIEW = keyword("view")
# COMPOUND KEYWORDS
joins = (
(
Optional(CROSS | OUTER | INNER | ((FULL | LEFT | RIGHT) + Optional(INNER | OUTER)))
+ JOIN
+ Optional(LATERAL)
)
| LATERAL + VIEW + Optional(OUTER)
) / (lambda tokens: " ".join(tokens).lower())
UNION_ALL = (UNION + ALL).set_parser_name("union_all")
WITHIN_GROUP = Group(WITHIN + GROUP).set_parser_name("within_group")
SELECT_DISTINCT = Group(SELECT + DISTINCT).set_parser_name("select distinct")
PARTITION_BY = Group(PARTITION + BY).set_parser_name("partition by")
GROUP_BY = Group(GROUP + BY).set_parser_name("group by")
ORDER_BY = Group(ORDER + BY).set_parser_name("order by")
# COMPOUND OPERATORS
AT_TIME_ZONE = Group(keyword("at") + keyword("time") + keyword("zone"))
NOT_BETWEEN = Group(NOT + BETWEEN).set_parser_name("not_between")
NOT_LIKE = Group(NOT + LIKE).set_parser_name("not_like")
NOT_RLIKE = Group(NOT + RLIKE).set_parser_name("not_rlike")
NOT_IN = Group(NOT + IN).set_parser_name("nin")
IS_NOT = Group(IS + NOT).set_parser_name("is_not")
_SIMILAR = keyword("similar")
_TO = keyword("to")
SIMILAR_TO = Group(_SIMILAR + _TO).set_parser_name("similar_to")
NOT_SIMILAR_TO = Group(NOT + _SIMILAR + _TO).set_parser_name("not_similar_to")
RESERVED = MatchFirst([
# ONLY INCLUDE SINGLE WORDS
ALL,
AND,
AS,
ASC,
ASSUMING,
BETWEEN,
BY,
CASE,
COLLATE,
CONSTRAINT,
CREATE,
CROSS,
DESC,
DISTINCT,
EXCEPT,
ELSE,
END,
FALSE,
FETCH,
FOREIGN,
FROM,
FULL,
FUNCTION,
GROUP_BY,
GROUP,
HAVING,
IN,
INDEX,
INNER,
INTERSECT,
INTERVAL,
IS_NOT,
IS,
JOIN,
KEY,
LATERAL,
LEFT,
LIKE,
LIMIT,
MINUS,
NOCASE,
NOT,
NULL,
OFFSET,
ON,
OR,
ORDER,
OUTER,
OVER,
PARTITION,
PRIMARY,
REFERENCES,
RIGHT,
RLIKE,
SELECT,
THEN,
TRUE,
UNION,
UNIQUE,
USING,
WHEN,
WHERE,
WINDOW,
WITH,
WITHIN,
INTO,
])
L_INLINE = Literal("<k>").suppress()
R_INLINE = Literal("</k>").suppress()
LBRACE = Literal("{").suppress()
RBRACE = Literal("}").suppress()
LB = Literal("(").suppress()
RB = Literal(")").suppress()
EQ = Char("=").suppress()
join_keywords = {
"join",
"full join",
"cross join",
"inner join",
"left join",
"right join",
"full outer join",
"right outer join",
"left outer join",
}
precedence = {
# https://www.sqlite.org/lang_expr.html
"literal": -1,
"interval": 0,
"cast": 0,
"collate": 0,
"concat": 1,
"mul": 2,
"div": 1.5,
"mod": 2,
"neg": 3,
"add": 3,
"sub": 2.5,
"binary_not": 4,
"binary_and": 4,
"binary_or": 4,
"gte": 5,
"lte": 5,
"lt": 5,
"gt": 6,
"eq": 7,
"neq": 7,
"missing": 7,
"exists": 7,
"at_time_zone": 8,
"between": 8,
"not_between": 8,
"in": 8,
"nin": 8,
"is": 8,
"like": 8,
"not_like": 8,
"rlike": 8,
"not_rlike": 8,
"similar_to": 8,
"not_similar_to": 8,
"and": 10,
"or": 11,
"lambda": 12,
"join": 18,
"list": 18,
"function": 30,
"select": 30,
"from": 30,
"window": 35,
"union": 40,
"union_all": 40,
"except": 40,
"minus": 40,
"intersect": 40,
"order": 50,
}
KNOWN_OPS = [
COLLATE,
CONCAT,
MUL | DIV | MOD,
NEG,
ADD | SUB,
BINARY_NOT,
BINARY_AND,
BINARY_OR,
GTE | LTE | LT | GT,
EEQ | NEQ | DEQ | IDF | INDF,
AT_TIME_ZONE,
(BETWEEN, AND),
(NOT_BETWEEN, AND),
IN,
NOT_IN,
IS_NOT,
IS,
LIKE,
NOT_LIKE,
RLIKE,
NOT_RLIKE,
SIMILAR_TO,
NOT_SIMILAR_TO,
NOT,
AND,
OR,
LAMBDA,
]
times = ["now", "today", "tomorrow", "eod"]
durations = {
"microseconds": "microsecond",
"microsecond": "microsecond",
"microsecs": "microsecond",
"microsec": "microsecond",
"useconds": "microsecond",
"usecond": "microsecond",
"usecs": "microsecond",
"usec": "microsecond",
"us": "microsecond",
"milliseconds": "millisecond",
"millisecond": "millisecond",
"millisecon": "millisecond",
"mseconds": "millisecond",
"msecond": "millisecond",
"millisecs": "millisecond",
"millisec": "millisecond",
"msecs": "millisecond",
"msec": "millisecond",
"ms": "millisecond",
"seconds": "second",
"second": "second",
"secs": "second",
"sec": "second",
"s": "second",
"minutes": "minute",
"minute": "minute",
"mins": "minute",
"min": "minute",
"m": "minute",
"hours": "hour",
"hour": "hour",
"hrs": "hour",
"hr": "hour",
"h": "hour",
"days": "day",
"day": "day",
"d": "day",
"dayofweek": "dow",
"dow": "dow",
"weekday": "dow",
"weeks": "week",
"week": "week",
"w": "week",
"months": "month",
"month": "month",
"mons": "month",
"mon": "month",
"quarters": "quarter",
"quarter": "quarter",
"years": "year",
"year": "year",
"decades": "decade",
"decade": "decade",
"decs": "decade",
"dec": "decade",
"centuries": "century",
"century": "century",
"cents": "century",
"cent": "century",
"c": "century",
"millennia": "millennium",
"millennium": "millennium",
"mils": "millennium",
"mil": "millennium",
"epoch": "epoch",
}
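# Example: the durations table canonicalizes interval-unit spellings, e.g.
#     durations["mins"]   -> "minute"
#     durations["usecs"]  -> "microsecond"
# so INTERVAL '5 mins' and INTERVAL '5 minutes' normalize to the same unit
# (see to_interval_call in aquery_parser.utils).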

@@ -0,0 +1,658 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
from mo_parsing.helpers import restOfLine
from mo_parsing.infix import delimited_list
from mo_parsing.whitespaces import NO_WHITESPACE, Whitespace
from aquery_parser.keywords import *
from aquery_parser.types import get_column_type, time_functions
from aquery_parser.utils import *
from aquery_parser.windows import window
digit = Char("0123456789")
simple_ident = (
Char(FIRST_IDENT_CHAR)
+ Char(IDENT_CHAR)[...] # let's not support dashes in var_names.
)
simple_ident = Regex(simple_ident.__regex__()[1])
def common_parser():
combined_ident = Combine(delimited_list(
ansi_ident | mysql_backtick_ident | simple_ident, separator=".", combine=True,
)).set_parser_name("identifier")
return parser(ansi_string | mysql_doublequote_string, combined_ident)
def mysql_parser():
mysql_string = ansi_string | mysql_doublequote_string
mysql_ident = Combine(delimited_list(
mysql_backtick_ident | sqlserver_ident | simple_ident,
separator=".",
combine=True,
)).set_parser_name("mysql identifier")
return parser(mysql_string, mysql_ident)
def sqlserver_parser():
combined_ident = Combine(delimited_list(
ansi_ident
| mysql_backtick_ident
| sqlserver_ident
| Word(FIRST_IDENT_CHAR, IDENT_CHAR),
separator=".",
combine=True,
)).set_parser_name("identifier")
return parser(ansi_string, combined_ident, sqlserver=True)
def parser(literal_string, ident, sqlserver=False):
with Whitespace() as engine:
engine.add_ignore(Literal("--") + restOfLine)
engine.add_ignore(Literal("#") + restOfLine)
engine.add_ignore(Literal("/*") + SkipTo("*/", include=True))
var_name = ~RESERVED + ident
inline_kblock = (L_INLINE + SkipTo(R_INLINE, include=True))("c")
# EXPRESSIONS
expr = Forward()
column_type, column_definition, column_def_references = get_column_type(
expr, var_name, literal_string
)
# CASE
case = (
CASE
+ Group(ZeroOrMore(
(WHEN + expr("when") + THEN + expr("then")) / to_when_call
))("case")
+ Optional(ELSE + expr("else"))
+ END
) / to_case_call
switch = (
CASE
+ expr("value")
+ Group(ZeroOrMore(
(WHEN + expr("when") + THEN + expr("then")) / to_when_call
))("case")
+ Optional(ELSE + expr("else"))
+ END
) / to_switch_call
cast = (
Group(CAST("op") + LB + expr("params") + AS + column_type("params") + RB)
/ to_json_call
)
trim = (
Group(
keyword("trim").suppress()
+ LB
+ Optional(
(keyword("both") | keyword("trailing") | keyword("leading"))
/ (lambda t: t[0].lower())
)("direction")
+ (
assign("from", expr)
| expr("chars") + Optional(assign("from", expr))
)
+ RB
).set_parser_name("trim")
/ to_trim_call
)
_standard_time_intervals = MatchFirst([
keyword(d) / (lambda t: durations[t[0].lower()]) for d in durations.keys()
]).set_parser_name("duration")("params")
duration = (
real_num | int_num | literal_string
)("params") + _standard_time_intervals
interval = (
INTERVAL + ("'" + delimited_list(duration) + "'" | duration)
) / to_interval_call
timestamp = (
time_functions("op")
+ (
literal_string("params")
| MatchFirst([
keyword(t) / (lambda t: t.lower()) for t in times
])("params")
)
) / to_json_call
extract = (
keyword("extract")("op")
+ LB
+ (_standard_time_intervals | expr("params"))
+ FROM
+ expr("params")
+ RB
) / to_json_call
alias = Optional((
(
AS
+ (var_name("name") + Optional(LB + delimited_list(ident("col")) + RB))
| (
var_name("name")
+ Optional(
(LB + delimited_list(ident("col")) + RB)
| (AS + delimited_list(var_name("col")))
)
)
)
/ to_alias
)("name"))
named_column = Group(Group(expr)("value") + alias)
stack = (
keyword("stack")("op")
+ LB
+ int_num("width")
+ ","
+ delimited_list(expr)("args")
+ RB
) / to_stack
# ARRAY[foo],
# ARRAY < STRING > [foo, bar], INVALID
# ARRAY < STRING > [foo, bar],
create_array = (
keyword("array")("op")
+ Optional(LT.suppress() + column_type("type") + GT.suppress())
+ (
LB + delimited_list(Group(expr))("args") + RB
| (Literal("[") + delimited_list(Group(expr))("args") + Literal("]"))
)
)
if not sqlserver:
# SQL SERVER DOES NOT SUPPORT [] FOR ARRAY CONSTRUCTION (USED FOR IDENTIFIERS)
create_array = (
Literal("[") + delimited_list(Group(expr))("args") + Literal("]")
| create_array
)
create_array = create_array / to_array
create_map = (
keyword("map")
+ Literal("[")
+ expr("keys")
+ ","
+ expr("values")
+ Literal("]")
) / to_map
create_struct = (
keyword("struct")("op")
+ Optional(
LT.suppress() + delimited_list(column_type)("types") + GT.suppress()
)
+ LB
+ delimited_list(Group((expr("value") + alias) / to_select_call))("args")
+ RB
).set_parser_name("create struct") / to_struct
distinct = (
DISTINCT("op") + delimited_list(named_column)("params")
) / to_json_call
query = Forward().set_parser_name("query")
call_function = (
ident("op")
+ LB
+ Optional(Group(query) | delimited_list(Group(expr)))("params")
+ Optional(
(keyword("respect") | keyword("ignore"))("nulls")
+ keyword("nulls").suppress()
)
+ RB
).set_parser_name("call function") / to_json_call
with NO_WHITESPACE:
def scale(tokens):
return {"mul": [tokens[0], tokens[1]]}
scale_function = ((real_num | int_num) + call_function) / scale
scale_ident = ((real_num | int_num) + ident) / scale
compound = (
NULL
| TRUE
| FALSE
| NOCASE
| interval
| timestamp
| extract
| case
| switch
| cast
| distinct
| trim
| stack
| create_array
| create_map
| create_struct
| (LB + Group(query) + RB)
| (LB + Group(delimited_list(expr)) / to_tuple_call + RB)
| literal_string.set_parser_name("string")
| hex_num.set_parser_name("hex")
| scale_function
| scale_ident
| real_num.set_parser_name("float")
| int_num.set_parser_name("int")
| call_function
| Combine(var_name + Optional(".*"))
)
sort_column = (
expr("value").set_parser_name("sort1")
+ Optional(DESC("sort") | ASC("sort"))
+ Optional(assign("nulls", keyword("first") | keyword("last")))
)
window_clause, over_clause = window(expr, var_name, sort_column)
expr << (
(
Literal("*")
| infix_notation(
compound,
[
(
Literal("[").suppress() + expr + Literal("]").suppress(),
1,
LEFT_ASSOC,
to_offset,
),
(
Literal(".").suppress() + simple_ident,
1,
LEFT_ASSOC,
to_offset,
),
(window_clause, 1, LEFT_ASSOC, to_window_mod),
(
assign("filter", LB + WHERE + expr + RB),
1,
LEFT_ASSOC,
to_window_mod,
),
]
+ [
(
o,
1 if o in unary_ops else (3 if isinstance(o, tuple) else 2),
unary_ops.get(o, LEFT_ASSOC),
to_lambda if o is LAMBDA else to_json_operator,
)
for o in KNOWN_OPS
],
)
)("value").set_parser_name("expression")
)
select_column = (
Group(
expr("value") + alias | Literal("*")("value")
).set_parser_name("column")
/ to_select_call
)
table_source = Forward()
join = (
Group(joins)("op")
+ table_source("join")
+ Optional((ON + expr("on")) | (USING + expr("using")))
| (
Group(WINDOW)("op")
+ Group(var_name("name") + AS + over_clause("value"))("join")
)
) / to_join_call
fassign = Group(var_name("var") + Suppress(FASSIGN) + expr("expr") + Suppress(";"))("assignment")
fassigns = fassign + ZeroOrMore(fassign, Whitespace(white=" \t"))
fbody = (Optional(fassigns) + expr("ret"))
udf = (
FUNCTION
+ var_name("fname")
+ LB
+ Optional(delimited_list(var_name)("params"))
+ RB
+ LBRACE
+ fbody
+ RBRACE
)
selection = (
(SELECT + DISTINCT + ON + LB)
+ delimited_list(select_column)("distinct_on")
+ RB
+ delimited_list(select_column)("select")
| SELECT + DISTINCT + delimited_list(select_column)("select_distinct")
| (
SELECT
+ Optional(
TOP
+ expr("value")
+ Optional(keyword("percent"))("percent")
+ Optional(WITH + keyword("ties"))("ties")
)("top")
/ to_top_clause
+ delimited_list(select_column)("select")
)
)
row = (LB + delimited_list(Group(expr)) + RB) / to_row
values = VALUES + delimited_list(row) / to_values
unordered_sql = Group(
values
| selection
+ Optional(
(FROM + delimited_list(table_source) + ZeroOrMore(join))("from")
+ Optional(WHERE + expr("where"))
+ Optional(GROUP_BY + delimited_list(Group(named_column))("groupby"))
+ Optional(HAVING + expr("having"))
)
).set_parser_name("unordered sql")
with NO_WHITESPACE:
def mult(tokens):
amount = tokens["bytes"]
scale = tokens["scale"].lower()
return {
"bytes": amount
* {"b": 1, "k": 1_000, "m": 1_000_000, "g": 1_000_000_000}[scale]
}
ts_bytes = (
(real_num | int_num)("bytes") + Char("bBkKmMgG")("scale")
) / mult
tablesample = assign(
"tablesample",
LB
+ (
(
keyword("bucket")("op")
+ int_num("params")
+ keyword("out of")
+ int_num("params")
+ Optional(ON + expr("on"))
)
/ to_json_call
| (real_num | int_num)("percent") + keyword("percent")
| int_num("rows") + keyword("rows")
| ts_bytes
)
+ RB,
)
assumption = Group((ASC | DESC)("sort") + var_name("value"))
assumptions = (ASSUMING + Group(delimited_list(assumption))("assumptions"))
table_source << Group(
((LB + query + RB) | stack | call_function | var_name)("value")
+ Optional(assumptions)
+ Optional(flag("with ordinality"))
+ Optional(tablesample)
+ alias
).set_parser_name("table_source") / to_table
rows = Optional(keyword("row") | keyword("rows"))
limit = (
Optional(assign("offset", expr) + rows)
& Optional(
FETCH
+ Optional(keyword("first") | keyword("next"))
+ expr("fetch")
+ rows
+ Optional(keyword("only"))
)
& Optional(assign("limit", expr))
)
outfile = Optional(
(
INTO
+ keyword("outfile").suppress()
+ literal_string("loc")
+ Optional(
keyword("fields")
+ keyword("terminated")
+ keyword("by")
+ literal_string("term")
)
)("outfile")
)
ordered_sql = (
(
(unordered_sql | (LB + query + RB))
+ ZeroOrMore(
Group(
(UNION | INTERSECT | EXCEPT | MINUS) + Optional(ALL | DISTINCT)
)("op")
+ (unordered_sql | (LB + query + RB))
)
)("union")
+ Optional(ORDER_BY + delimited_list(Group(sort_column))("orderby"))
+ limit
+ outfile
).set_parser_name("ordered sql") / to_union_call
with_expr = delimited_list(Group(
(
(var_name("name") + Optional(LB + delimited_list(ident("col")) + RB))
/ to_alias
)("name")
+ (AS + LB + (query | expr)("value") + RB)
))
query << (
Optional(assign("with recursive", with_expr) | assign("with", with_expr))
+ Group(ordered_sql)("query")
) / to_query
#####################################################################
# DML STATEMENTS
#####################################################################
# MySQL's index_type := Using + ( "BTREE" | "HASH" )
index_type = Optional(assign("using", ident("index_type")))
index_column_names = LB + delimited_list(var_name("columns")) + RB
column_def_delete = assign(
"on delete",
(keyword("cascade") | keyword("set null") | keyword("set default")),
)
table_def_foreign_key = FOREIGN_KEY + Optional(
Optional(var_name("index_name"))
+ index_column_names
+ column_def_references
+ Optional(column_def_delete)
)
index_options = ZeroOrMore(var_name)("table_constraint_options")
table_constraint_definition = Optional(CONSTRAINT + var_name("name")) + (
assign("primary key", index_type + index_column_names + index_options)
| (
Optional(flag("unique"))
+ Optional(INDEX | KEY)
+ Optional(var_name("name"))
+ index_type
+ index_column_names
+ index_options
)("index")
| assign("check", LB + expr + RB)
| table_def_foreign_key("foreign_key")
)
table_element = (
column_definition("columns") | table_constraint_definition("constraint")
)
create_table = (
keyword("create")
+ Optional(keyword("or") + flag("replace"))
+ Optional(flag("temporary"))
+ TABLE
+ Optional((keyword("if not exists") / (lambda: False))("replace"))
+ var_name("name")
+ Optional(LB + delimited_list(table_element) + RB)
+ ZeroOrMore(
assign("engine", EQ + var_name)
| assign("collate", EQ + var_name)
| assign("auto_increment", EQ + int_num)
| assign("comment", EQ + literal_string)
| assign("default character set", EQ + var_name)
| assign("default charset", EQ + var_name)
)
+ Optional(AS.suppress() + infix_notation(query, [])("query"))
)("create_table")
create_view = (
keyword("create")
+ Optional(keyword("or") + flag("replace"))
+ Optional(flag("temporary"))
+ VIEW.suppress()
+ Optional((keyword("if not exists") / (lambda: False))("replace"))
+ var_name("name")
+ AS
+ query("query")
)("create_view")
# CREATE INDEX a ON u USING btree (e);
create_index = (
keyword("create index")
+ Optional(keyword("or") + flag("replace"))(INDEX | KEY)
+ Optional((keyword("if not exists") / (lambda: False))("replace"))
+ var_name("name")
+ ON
+ var_name("table")
+ index_type
+ index_column_names
+ index_options
)("create index")
cache_options = Optional((
keyword("options").suppress()
+ LB
+ Dict(delimited_list(Group(
literal_string / (lambda tokens: tokens[0]["literal"])
+ Optional(EQ)
+ var_name
)))
+ RB
)("options"))
create_cache = (
keyword("cache").suppress()
+ Optional(flag("lazy"))
+ TABLE
+ var_name("name")
+ cache_options
+ Optional(AS + query("query"))
)("cache")
drop_table = (
keyword("drop table") + Optional(flag("if exists")) + var_name("table")
)("drop")
drop_view = (
keyword("drop view") + Optional(flag("if exists")) + var_name("view")
)("drop")
drop_index = (
keyword("drop index") + Optional(flag("if exists")) + var_name("index")
)("drop")
insert = (
keyword("insert").suppress()
+ (
flag("overwrite") + keyword("table").suppress()
| keyword("into").suppress() + Optional(keyword("table").suppress())
)
+ var_name("table")
+ Optional(LB + delimited_list(var_name)("columns") + RB)
+ Optional(flag("if exists"))
+ (values | query)("query")
) / to_insert_call
update = (
keyword("update")("op")
+ var_name("params")
+ assign("set", Dict(delimited_list(Group(var_name + EQ + expr))))
+ Optional(assign("where", expr))
) / to_json_call
delete = (
keyword("delete")("op")
+ keyword("from").suppress()
+ var_name("params")
+ Optional(assign("where", expr))
) / to_json_call
load = (
keyword("load")("op")
+ keyword("data").suppress()
+ keyword("infile")("loc")
+ literal_string("file")
+ INTO
+ keyword("table").suppress()
+ var_name("table")
+ Optional(
keyword("fields").suppress()
+ keyword("terminated").suppress()
+ keyword("by").suppress()
+ literal_string("term")
)
)("load")
sql_stmts = delimited_list( (
query
| (insert | update | delete | load)
| (create_table | create_view | create_cache | create_index)
| (drop_table | drop_view | drop_index)
)("stmts"), ";")
other_stmt = (
inline_kblock
| udf
) ("stmts")
stmts = ZeroOrMore(
sql_stmts
| other_stmt
| keyword(";").suppress() # empty stmt
)
return stmts.finalize()
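# Usage sketch (results are scrubbed into JSON by aquery_parser.parse; the
# statements below only illustrate syntax the grammar above accepts):
#
#     p = common_parser()
#     # AQuery's ASSUMING clause declares a sort order on a table source:
#     p.parse_string("SELECT avg(price) FROM stocks ASSUMING ASC ts", parse_all=True)
#     # AQuery UDF syntax: ":=" assignments, final expression as return value:
#     p.parse_string("FUNCTION sq(a) { b := a * a; b }", parse_all=True)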

@@ -0,0 +1,223 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
# KNOWN TYPES
from mo_parsing import Forward, Group, Optional, MatchFirst, Literal, ZeroOrMore, export
from mo_parsing.infix import delimited_list, RIGHT_ASSOC, LEFT_ASSOC
from aquery_parser.keywords import (
RB,
LB,
NEG,
NOT,
BINARY_NOT,
NULL,
EQ,
KNOWN_OPS,
LT,
GT,
)
from aquery_parser.utils import (
keyword,
to_json_call,
int_num,
ansi_string,
ansi_ident,
assign,
flag,
)
_size = Optional(LB + int_num("params") + RB)
_sizes = Optional(LB + delimited_list(int_num("params")) + RB)
simple_types = Forward()
BIGINT = Group(keyword("bigint")("op") + Optional(_size) + Optional(flag("unsigned"))) / to_json_call
BOOL = Group(keyword("bool")("op")) / to_json_call
BOOLEAN = Group(keyword("boolean")("op")) / to_json_call
DOUBLE = Group(keyword("double")("op")) / to_json_call
FLOAT64 = Group(keyword("float64")("op")) / to_json_call
FLOAT = Group(keyword("float")("op")) / to_json_call
GEOMETRY = Group(keyword("geometry")("op")) / to_json_call
INTEGER = Group(keyword("integer")("op")) / to_json_call
INT = (keyword("int")("op") + _size) / to_json_call
INT32 = Group(keyword("int32")("op")) / to_json_call
INT64 = Group(keyword("int64")("op")) / to_json_call
REAL = Group(keyword("real")("op")) / to_json_call
TEXT = Group(keyword("text")("op")) / to_json_call
SMALLINT = Group(keyword("smallint")("op")) / to_json_call
STRING = Group(keyword("string")("op")) / to_json_call
BLOB = (keyword("blob")("op") + _size) / to_json_call
BYTES = (keyword("bytes")("op") + _size) / to_json_call
CHAR = (keyword("char")("op") + _size) / to_json_call
NCHAR = (keyword("nchar")("op") + _size) / to_json_call
VARCHAR = (keyword("varchar")("op") + _size) / to_json_call
VARCHAR2 = (keyword("varchar2")("op") + _size) / to_json_call
VARBINARY = (keyword("varbinary")("op") + _size) / to_json_call
TINYINT = (keyword("tinyint")("op") + _size) / to_json_call
UUID = Group(keyword("uuid")("op")) / to_json_call
DECIMAL = (keyword("decimal")("op") + _sizes) / to_json_call
DOUBLE_PRECISION = (
Group((keyword("double precision") / (lambda: "double_precision"))("op"))
/ to_json_call
)
NUMERIC = (keyword("numeric")("op") + _sizes) / to_json_call
NUMBER = (keyword("number")("op") + _sizes) / to_json_call
MAP_TYPE = (
keyword("map")("op") + LB + delimited_list(simple_types("params")) + RB
) / to_json_call
ARRAY_TYPE = (keyword("array")("op") + LB + simple_types("params") + RB) / to_json_call
DATE = keyword("date")
DATETIME = keyword("datetime")
DATETIME_W_TIMEZONE = keyword("datetime with time zone")
TIME = keyword("time")
TIMESTAMP = keyword("timestamp")
TIMESTAMP_W_TIMEZONE = keyword("timestamp with time zone")
TIMESTAMPTZ = keyword("timestamptz")
TIMETZ = keyword("timetz")
time_functions = DATE | DATETIME | TIME | TIMESTAMP | TIMESTAMPTZ | TIMETZ
# KNOWN TIME TYPES
_format = Optional((ansi_string | ansi_ident)("params"))
DATE_TYPE = (DATE("op") + _format) / to_json_call
DATETIME_TYPE = (DATETIME("op") + _format) / to_json_call
DATETIME_W_TIMEZONE_TYPE = (DATETIME_W_TIMEZONE("op") + _format) / to_json_call
TIME_TYPE = (TIME("op") + _format) / to_json_call
TIMESTAMP_TYPE = (TIMESTAMP("op") + _format) / to_json_call
TIMESTAMP_W_TIMEZONE_TYPE = (TIMESTAMP_W_TIMEZONE("op") + _format) / to_json_call
TIMESTAMPTZ_TYPE = (TIMESTAMPTZ("op") + _format) / to_json_call
TIMETZ_TYPE = (TIMETZ("op") + _format) / to_json_call
simple_types << MatchFirst([
ARRAY_TYPE,
BIGINT,
BOOL,
BOOLEAN,
BLOB,
BYTES,
CHAR,
DATE_TYPE,
DATETIME_W_TIMEZONE_TYPE,
DATETIME_TYPE,
DECIMAL,
DOUBLE_PRECISION,
DOUBLE,
FLOAT64,
FLOAT,
GEOMETRY,
MAP_TYPE,
INTEGER,
INT,
INT32,
INT64,
NCHAR,
NUMBER,
NUMERIC,
REAL,
TEXT,
SMALLINT,
STRING,
TIME_TYPE,
TIMESTAMP_W_TIMEZONE_TYPE,
TIMESTAMP_TYPE,
TIMESTAMPTZ_TYPE,
TIMETZ_TYPE,
TINYINT,
UUID,
VARCHAR,
VARCHAR2,
VARBINARY,
])
CASTING = (Literal("::").suppress() + simple_types("params")).set_parser_name("cast")
KNOWN_OPS.insert(0, CASTING)
unary_ops = {
NEG: RIGHT_ASSOC,
NOT: RIGHT_ASSOC,
BINARY_NOT: RIGHT_ASSOC,
CASTING: LEFT_ASSOC,
}
def get_column_type(expr, var_name, literal_string):
column_definition = Forward()
column_type = Forward().set_parser_name("column type")
struct_type = (
keyword("struct")("op")
+ LT.suppress()
+ Group(delimited_list(column_definition))("params")
+ GT.suppress()
) / to_json_call
row_type = (
keyword("row")("op")
+ LB
+ Group(delimited_list(column_definition))("params")
+ RB
) / to_json_call
array_type = (
keyword("array")("op")
+ (
(
LT.suppress()
+ Group(delimited_list(column_type))("params")
+ GT.suppress()
)
| (LB + Group(delimited_list(column_type))("params") + RB)
)
) / to_json_call
column_type << (struct_type | row_type | array_type | simple_types)
column_def_identity = (
assign(
"generated",
(keyword("always") | keyword("by default") / (lambda: "by_default")),
)
+ keyword("as identity").suppress()
+ Optional(assign("start with", int_num))
+ Optional(assign("increment by", int_num))
)
column_def_references = assign(
"references", var_name("table") + LB + delimited_list(var_name)("columns") + RB,
)
column_options = ZeroOrMore(
((NOT + NULL) / (lambda: False))("nullable")
| (NULL / (lambda t: True))("nullable")
| flag("unique")
| flag("auto_increment")
| assign("comment", literal_string)
| assign("collate", Optional(EQ) + var_name)
| flag("primary key")
| column_def_identity("identity")
| column_def_references
| assign("check", LB + expr + RB)
| assign("default", expr)
).set_parser_name("column_options")
column_definition << Group(
var_name("name") + (column_type | var_name)("type") + column_options
).set_parser_name("column_definition")
return column_type, column_definition, column_def_references
export("aquery_parser.utils", unary_ops)
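# Example: the "::" CASTING operator registered above allows a type suffix on
# expressions; expected scrubbed shape (assumed from upstream mo-sql-parsing):
#
#     parse("SELECT a::int FROM t")
#     # -> {"select": {"value": {"cast": ["a", {"int": {}}]}}, "from": "t"}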

@@ -0,0 +1,618 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
import ast
from mo_dots import is_data, is_null, Data, from_data
from mo_future import text, number_types, binary_type, flatten
from mo_imports import expect
from mo_parsing import *
from mo_parsing.utils import is_number, listwrap
unary_ops = expect("unary_ops")
class Call(object):
__slots__ = ["op", "args", "kwargs"]
def __init__(self, op, args, kwargs):
self.op = op
self.args = args
self.kwargs = kwargs
IDENT_CHAR = Regex("[@_$0-9A-Za-zÀ-ÖØ-öø-ƿ]").expr.parser_config.include
FIRST_IDENT_CHAR = "".join(set(IDENT_CHAR) - set("0123456789"))
SQL_NULL = Call("null", [], {})
null_locations = []
def keyword(keywords):
return And([
Keyword(k, caseless=True) for k in keywords.split(" ")
]).set_parser_name(keywords) / (lambda: keywords.replace(" ", "_"))
def flag(keywords):
"""
RETURN {keywords: True}
"""
return (keyword(keywords) / (lambda: True))(keywords.replace(" ", "_"))
def assign(key: str, value: ParserElement):
return keyword(key).suppress() + value(key.replace(" ", "_"))
def simple_op(op, args, kwargs):
if args is None:
kwargs[op] = {}
else:
kwargs[op] = args
return kwargs
def normal_op(op, args, kwargs):
output = Data(op=op)
args = listwrap(args)
if args and (not isinstance(args[0], dict) or args[0]):
output.args = args
if kwargs:
output.kwargs = kwargs
return from_data(output)
scrub_op = simple_op
def scrub(result):
if result is SQL_NULL:
return SQL_NULL
elif result == None:
return None
elif isinstance(result, text):
return result
elif isinstance(result, binary_type):
return result.decode("utf8")
elif isinstance(result, number_types):
return result
elif isinstance(result, Call):
kwargs = scrub(result.kwargs)
args = scrub(result.args)
if args is SQL_NULL:
null_locations.append((kwargs, result.op))
return scrub_op(result.op, args, kwargs)
elif isinstance(result, dict) and not result:
return result
elif isinstance(result, list):
output = [rr for r in result for rr in [scrub(r)]]
if not output:
return None
elif len(output) == 1:
return output[0]
else:
for i, v in enumerate(output):
if v is SQL_NULL:
null_locations.append((output, i))
return output
else:
# ATTEMPT A DICT INTERPRETATION
try:
kv_pairs = list(result.items())
except Exception:
# NOT DICT-LIKE: FALL BACK TO SEQUENCE INTERPRETATION
return scrub(list(result))
output = {k: vv for k, v in kv_pairs for vv in [scrub(v)] if not is_null(vv)}
if isinstance(result, dict) or output:
for k, v in output.items():
if v is SQL_NULL:
null_locations.append((output, k))
return output
return scrub(list(result))
def _chunk(values, size):
acc = []
for v in values:
acc.append(v)
if len(acc) == size:
yield acc
acc = []
if acc:
yield acc
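# e.g. _chunk([1, "hour", 30, "minute"], size=2) -> [[1, "hour"], [30, "minute"]]
# (used by to_interval_call below to pair each amount with its unit)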
def to_lambda(tokens):
params, op, expr = list(tokens)
return Call("lambda", [expr], {"params": list(params)})
def to_json_operator(tokens):
# ARRANGE INTO {op: params} FORMAT
length = len(tokens.tokens)
if length == 2:
if tokens.tokens[1].type.parser_name == "cast":
return Call("cast", list(tokens), {})
# UNARY OPERATOR
op = tokens.tokens[0].type.parser_name
if op == "neg" and is_number(tokens[1]):
return -tokens[1]
return Call(op, [tokens[1]], {})
elif length == 5:
# TRINARY OPERATOR
return Call(
tokens.tokens[1].type.parser_name, [tokens[0], tokens[2], tokens[4]], {}
)
op = tokens[1]
if not isinstance(op, text):
op = op.type.parser_name
op = binary_ops.get(op, op)
if op == "eq":
if tokens[2] is SQL_NULL:
return Call("missing", tokens[0], {})
elif tokens[0] is SQL_NULL:
return Call("missing", tokens[2], {})
elif op == "neq":
if tokens[2] is SQL_NULL:
return Call("exists", tokens[0], {})
elif tokens[0] is SQL_NULL:
return Call("exists", tokens[2], {})
elif op == "eq!":
if tokens[2] is SQL_NULL:
return Call("missing", tokens[0], {})
elif tokens[0] is SQL_NULL:
return Call("missing", tokens[2], {})
elif op == "ne!":
if tokens[2] is SQL_NULL:
return Call("exists", tokens[0], {})
elif tokens[0] is SQL_NULL:
return Call("exists", tokens[2], {})
elif op == "is":
if tokens[2] is SQL_NULL:
return Call("missing", tokens[0], {})
else:
return Call("exists", tokens[0], {})
elif op == "is_not":
if tokens[2] is SQL_NULL:
return Call("exists", tokens[0], {})
else:
return Call("missing", tokens[0], {})
operands = [tokens[0], tokens[2]]
binary_op = Call(op, operands, {})
if op in {"add", "mul", "and", "or"}:
# ASSOCIATIVE OPERATORS
acc = []
for operand in operands:
while isinstance(operand, ParseResults) and isinstance(operand.type, Group):
# PARENTHESES CAUSE EXTRA GROUP LAYERS
operand = operand[0]
if isinstance(operand, ParseResults) and isinstance(
operand.type, Forward
):
operand = operand[0]
if isinstance(operand, Call) and operand.op == op:
acc.extend(operand.args)
elif isinstance(operand, list):
acc.append(operand)
elif isinstance(operand, dict) and operand.get(op):
acc.extend(operand.get(op))
else:
acc.append(operand)
binary_op = Call(op, acc, {})
return binary_op
def to_offset(tokens):
expr, offset = tokens.tokens
return Call("get", [expr, offset], {})
def to_window_mod(tokens):
expr, window = tokens.tokens
return Call("value", [expr], {**window})
def to_tuple_call(tokens):
# IS THIS ONE VALUE IN (), OR MANY?
tokens = list(tokens)
if len(tokens) == 1:
return [tokens[0]]
if all(isinstance(r, number_types) for r in tokens):
return [tokens]
if all(
isinstance(r, number_types) or (is_data(r) and "literal" in r.keys())
for r in tokens
):
candidate = {"literal": [r["literal"] if is_data(r) else r for r in tokens]}
return candidate
return [tokens]
binary_ops = {
"::": "cast",
"COLLATE": "collate",
"||": "concat",
"*": "mul",
"/": "div",
"%": "mod",
"+": "add",
"-": "sub",
"&": "binary_and",
"|": "binary_or",
"<": "lt",
"<=": "lte",
">": "gt",
">=": "gte",
"=": "eq",
"==": "eq",
"is distinct from": "eq!", # https://sparkbyexamples.com/apache-hive/hive-relational-arithmetic-logical-operators/
"is_distinct_from": "eq!",
"is not distinct from": "ne!",
"is_not_distinct_from": "ne!",
"<=>": "eq!", # https://sparkbyexamples.com/apache-hive/hive-relational-arithmetic-logical-operators/
"!=": "neq",
"<>": "neq",
"not in": "nin",
"in": "in",
"is_not": "neq",
"is": "eq",
"similar_to": "similar_to",
"like": "like",
"rlike": "rlike",
"not like": "not_like",
"not_like": "not_like",
"not rlike": "not_rlike",
"not_rlike": "not_rlike",
"not_simlilar_to": "not_similar_to",
"or": "or",
"and": "and",
"->": "lambda",
"union": "union",
"union_all": "union_all",
"union all": "union_all",
"except": "except",
"minus": "minus",
"intersect": "intersect",
}
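# Example: binary_ops canonicalizes operator spellings before Call nodes are
# built, so "<>" and "!=" produce the same tree (shape assumed from upstream):
#
#     parse("SELECT * FROM t WHERE a <> 1")
#     # -> {"select": "*", "from": "t", "where": {"neq": ["a", 1]}}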
is_set_op = ("union", "union_all", "except", "minus", "intersect")
def to_trim_call(tokens):
frum = tokens["from"]
if not frum:
return Call("trim", [tokens["chars"]], {"direction": tokens["direction"]})
return Call(
"trim",
[frum],
{"characters": tokens["chars"], "direction": tokens["direction"]},
)
def to_json_call(tokens):
# ARRANGE INTO {op: params} FORMAT
op = tokens["op"].lower()
op = binary_ops.get(op, op)
params = tokens["params"]
if isinstance(params, (dict, str, int, Call)):
args = [params]
else:
args = list(params)
kwargs = {k: v for k, v in tokens.items() if k not in ("op", "params")}
return ParseResults(
tokens.type,
tokens.start,
tokens.end,
[Call(op, args, kwargs)],
tokens.failures,
)
def to_interval_call(tokens):
# ARRANGE INTO {interval: [amount, type]} FORMAT
params = tokens["params"]
if not params:
params = {}
if params.length() == 2:
return Call("interval", params, {})
return Call("add", [Call("interval", p, {}) for p in _chunk(params, size=2)], {})
def to_case_call(tokens):
cases = list(tokens["case"])
elze = tokens["else"]
if elze != None:
cases.append(elze)
return Call("case", cases, {})
def to_switch_call(tokens):
# CONVERT TO CLASSIC CASE STATEMENT
value = tokens["value"]
acc = []
for c in list(tokens["case"]):
acc.append(Call("when", [Call("eq", [value] + c.args, {})], c.kwargs))
elze = tokens["else"]
if elze != None:
acc.append(elze)
return Call("case", acc, {})
def to_when_call(tokens):
tok = tokens
return Call("when", [tok["when"]], {"then": tok["then"]})
def to_join_call(tokens):
op = " ".join(tokens["op"])
if tokens["join"]["name"]:
output = {op: {
"name": tokens["join"]["name"],
"value": tokens["join"]["value"],
}}
else:
output = {op: tokens["join"]}
output["on"] = tokens["on"]
output["using"] = tokens["using"]
return output
def to_expression_call(tokens):
if set(tokens.keys()) & {"over", "within", "filter"}:
return
return ParseResults(
tokens.type,
tokens.start,
tokens.end,
listwrap(tokens["value"]),
tokens.failures,
)
def to_over(tokens):
if not tokens:
return {}
def to_alias(tokens):
cols = tokens["col"]
name = tokens["name"]
if cols:
return {name: cols}
return name
def to_top_clause(tokens):
value = tokens["value"]
if not value:
return None
value = value.value()
if tokens["ties"]:
output = {}
output["ties"] = True
if tokens["percent"]:
output["percent"] = value
else:
output["value"] = value
return output
elif tokens["percent"]:
return {"percent": value}
else:
return [value]
def to_row(tokens):
columns = list(tokens)
if len(columns) > 1:
return {"select": [{"value": v[0]} for v in columns]}
else:
return {"select": {"value": columns[0]}}
def get_literal(value):
if isinstance(value, (int, float)):
return value
elif isinstance(value, Call):
return
elif value is SQL_NULL:
return value
elif "literal" in value:
return value["literal"]
def to_values(tokens):
rows = list(tokens)
if len(rows) > 1:
values = [
[get_literal(s["value"]) for s in listwrap(row["select"])] for row in rows
]
if all(flatten(values)):
return {"from": {"literal": values}}
return {"union_all": list(tokens)}
else:
return rows
def to_stack(tokens):
width = tokens["width"]
args = listwrap(tokens["args"])
return Call("stack", args, {"width": width})
def to_array(tokens):
types = list(tokens["type"])
args = list(tokens["args"])
output = Call("create_array", args, {})
if types:
output = Call("cast", [output, Call("array", types, {})], {})
return output
def to_map(tokens):
keys = tokens["keys"]
values = tokens["values"]
return Call("create_map", [keys, values], {})
def to_struct(tokens):
types = list(tokens["types"])
args = list(d for a in tokens["args"] for d in [a if a["name"] else a["value"]])
output = Call("create_struct", args, {})
if types:
output = Call("cast", [output, Call("struct", types, {})], {})
return output
def to_select_call(tokens):
expr = tokens["value"]
if expr == "*":
return ["*"]
try:
call = expr[0][0]
if call.op == "value":
return {"name": tokens["name"], "value": call.args, **call.kwargs}
except Exception:
pass
def to_union_call(tokens):
unions = tokens["union"]
if isinstance(unions, dict):
return unions
elif unions.type.parser_name == "unordered sql":
output = {k: v for k, v in unions.items()} # REMOVE THE Group()
else:
unions = list(unions)
sources = [unions[i] for i in range(0, len(unions), 2)]
operators = ["_".join(unions[i]) for i in range(1, len(unions), 2)]
acc = sources[0]
last_union = None
for op, so in list(zip(operators, sources[1:])):
if op == last_union and "union" in op:
acc[op] = acc[op] + [so]
else:
acc = {op: [acc, so]}
last_union = op
if not tokens["orderby"] and not tokens["offset"] and not tokens["limit"]:
return acc
else:
output = {"from": acc}
output["orderby"] = tokens["orderby"]
output["limit"] = tokens["limit"]
output["offset"] = tokens["offset"]
output["fetch"] = tokens["fetch"]
output["outfile"] = tokens["outfile"]
return output
def to_insert_call(tokens):
options = {
k: v for k, v in tokens.items() if k not in ["columns", "table", "query"]
}
query = tokens["query"]
columns = tokens["columns"]
try:
values = query["from"]["literal"]
if values:
if columns:
data = [dict(zip(columns, row)) for row in values]
return Call("insert", [tokens["table"]], {"values": data, **options})
else:
return Call("insert", [tokens["table"]], {"values": values, **options})
except Exception:
pass
return Call(
"insert", [tokens["table"]], {"columns": columns, "query": query, **options}
)
def to_query(tokens):
output = tokens["query"][0]
try:
output["with"] = tokens["with"]
output["with_recursive"] = tokens["with_recursive"]
return output
except Exception as cause:
return
def to_table(tokens):
output = dict(tokens)
if len(list(output.keys())) > 1:
return output
else:
return output["value"]
def unquote(tokens):
val = tokens[0]
if val.startswith("'") and val.endswith("'"):
val = "'" + val[1:-1].replace("''", "\\'") + "'"
elif val.startswith('"') and val.endswith('"'):
val = '"' + val[1:-1].replace('""', '\\"') + '"'
elif val.startswith("`") and val.endswith("`"):
val = '"' + val[1:-1].replace("``", "`").replace('"', '\\"') + '"'
elif val.startswith("[") and val.endswith("]"):
val = '"' + val[1:-1].replace("]]", "]").replace('"', '\\"') + '"'
elif val.startswith("+"):
val = val[1:]
un = ast.literal_eval(val).replace(".", "\\.")
return un
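# e.g. `my``col` -> my`col and [my]]col] -> my]col; embedded double quotes are
# escaped, and dots are backslash-escaped so qualified-name splitting on "." stays unambiguous.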
def to_string(tokens):
val = tokens[0]
val = "'" + val[1:-1].replace("''", "\\'") + "'"
return {"literal": ast.literal_eval(val)}
# NUMBERS
real_num = (
Regex(r"[+-]?(\d+\.\d*|\.\d+)([eE][+-]?\d+)?").set_parser_name("float")
/ (lambda t: float(t[0]))
)
def parse_int(tokens):
if "e" in tokens[0].lower():
return int(float(tokens[0]))
else:
return int(tokens[0])
int_num = Regex(r"[+-]?\d+([eE]\+?\d+)?").set_parser_name("int") / parse_int
hex_num = (
Regex(r"0x[0-9a-fA-F]+").set_parser_name("hex") / (lambda t: {"hex": t[0][2:]})
)
# STRINGS
ansi_string = Regex(r"\'(\'\'|[^'])*\'") / to_string
mysql_doublequote_string = Regex(r'\"(\"\"|[^"])*\"') / to_string
# BASIC IDENTIFIERS
ansi_ident = Regex(r'\"(\"\"|[^"])*\"') / unquote
mysql_backtick_ident = Regex(r"\`(\`\`|[^`])*\`") / unquote
sqlserver_ident = Regex(r"\[(\]\]|[^\]])*\]") / unquote

@ -0,0 +1,107 @@
# encoding: utf-8
#
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this file,
# You can obtain one at http://mozilla.org/MPL/2.0/.
#
# Contact: Kyle Lahnakoski (kyle@lahnakoski.com)
#
from __future__ import absolute_import, division, unicode_literals
from mo_parsing.infix import delimited_list
from aquery_parser.keywords import *
from aquery_parser.utils import *
# https://docs.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-ver15
def _to_bound_call(tokens):
zero = tokens["zero"]
if zero:
return {"min": 0, "max": 0}
direction = scrub(tokens["direction"])
limit = scrub(tokens["limit"])
if direction == "preceding":
if limit == "unbounded":
return {"max": 0}
elif is_data(limit):
return {"min": {"neg": limit}, "max": 0}
else:
return {"min": -limit, "max": 0}
else: # following
if limit == "unbounded":
return {"min": 0}
elif is_data(limit):
return {"min": {"neg": limit}, "max": 0}
else:
return {"min": 0, "max": limit}
def _to_between_call(tokens):
minn = scrub(tokens["min"])
maxx = scrub(tokens["max"])
if maxx.get("max") == 0:
# both bounds PRECEDING/CURRENT; the max offset is carried in maxx["min"]
return {
"min": minn.get("min"),
"max": maxx.get("min"),
}
elif minn.get("min") == 0:
# both bounds FOLLOWING/CURRENT; the min offset is carried in minn["max"]
return {"min": minn.get("max"), "max": maxx.get("max")}
else:
return {
"min": minn.get("min"),
"max": maxx.get("max"),
}
UNBOUNDED = keyword("unbounded")
PRECEDING = keyword("preceding")
FOLLOWING = keyword("following")
CURRENT_ROW = keyword("current row")
ROWS = keyword("rows")
RANGE = keyword("range")
def window(expr, var_name, sort_column):
bound_row = (
CURRENT_ROW("zero")
| (UNBOUNDED | int_num)("limit") + (PRECEDING | FOLLOWING)("direction")
) / _to_bound_call
bound_expr = (
CURRENT_ROW("zero")
| (UNBOUNDED | expr)("limit") + (PRECEDING | FOLLOWING)("direction")
) / _to_bound_call
between_row = (
BETWEEN + bound_row("min") + AND + bound_row("max")
) / _to_between_call
between_expr = (
BETWEEN + bound_expr("min") + AND + bound_expr("max")
) / _to_between_call
row_clause = (ROWS.suppress() + (between_row | bound_row)) | (
RANGE.suppress() + (between_expr | bound_expr)
)
over_clause = (
LB
+ Optional(PARTITION_BY + delimited_list(Group(expr))("partitionby"))
+ Optional(ORDER_BY + delimited_list(Group(sort_column))("orderby"))
+ Optional(row_clause("range"))
+ RB
)
window_clause = Optional((
WITHIN_GROUP
+ LB
+ Optional(ORDER_BY + delimited_list(Group(sort_column))("orderby"))
+ RB
)("within")) + ((OVER + (over_clause | var_name) / to_over)("over"))
return window_clause, over_clause

1273
csv.h

File diff suppressed because it is too large

@ -0,0 +1,45 @@
import mariadb
class dbconn:
def __init__(self) -> None:
self.db = None
self.cur = None
def clear(self):
drop_all = f'''
SET FOREIGN_KEY_CHECKS = 0;
SET @tables = NULL;
SELECT GROUP_CONCAT('`', table_schema, '`.`', table_name, '`') INTO @tables
FROM information_schema.tables
WHERE table_schema = '{self.db.database}';
SET @tables = CONCAT('DROP TABLE ', @tables);
PREPARE stmt FROM @tables;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
SET FOREIGN_KEY_CHECKS = 1;
'''
if self.db:
if not self.cur:
self.cur = self.db.cursor()
self.cur.execute(drop_all)
def connect(self, ip, password = '0508', user = 'root', db = 'db', port = 3306):
try:
self.db = mariadb.connect(
user = user,
password = password,
host = ip,
port = port,
database = db
)
self.cur = self.db.cursor()
except mariadb.Error as e:
print(e)
self.db = None
self.cur = None
def exec(self, sql, params = None):
if params is None:
self.cur.execute(sql)
else:
self.cur.execute(sql, params)
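# Minimal usage sketch (hypothetical host and credentials):
#   db = dbconn()
#   db.connect('127.0.0.1', password='...', user='root', db='db')
#   db.exec('CREATE TABLE t(a INT)')
#   db.clear()  # drops every table in the connected schema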

File diff suppressed because one or more lines are too long

@ -0,0 +1,12 @@
from engine.ast import Context, ast_node
import engine.ddl, engine.projection
def initialize():
return Context()
def generate(ast, cxt):
for k in ast.keys():
if k in ast_node.types.keys():
root = ast_node.types[k](None, ast, cxt)
__all__ = ["initialize", "generate"]

@ -0,0 +1,369 @@
from engine.utils import base62uuid
from copy import copy
# replace column info with this later.
class ColRef:
def __init__(self, cname, _ty, cobj, cnt, table:'TableInfo', name, id, compound = False):
self.cname = cname # column object location
self.cxt_name = None # column object in context
self.type = _ty
self.cobj = cobj
self.cnt = cnt
self.table = table
self.name = name
self.id = id # position in table
self.order_pending = None # order_pending
self.compound = compound # compound field (list as a field)
self.views = []
self.aux_columns = [] # columns for temporary calculations
# e.g. order by, group by, filter by expressions
self.__arr__ = [cname, _ty, cobj, cnt, table, name, id] # a list, so __setitem__ can mutate it
def reference(self):
cxt = self.table.cxt
self.table.reference()
if self not in cxt.columns_in_context:
counter = 0
base_name = self.table.table_name + '_' + self.name
if base_name in cxt.columns_in_context.values():
while (f'{base_name}_{counter}') in cxt.columns_in_context.values():
counter += 1
base_name = f'{base_name}_{counter}'
self.cxt_name = base_name
cxt.columns_in_context[self] = base_name
# TODO: change this to cname;
cxt.emit(f'auto& {base_name} = *(ColRef<{self.type}> *)(&{self.table.cxt_name}->colrefs[{self.id}]);')
elif self.cxt_name is None:
self.cxt_name = cxt.columns_in_context[self]
return self.cxt_name
def __getitem__(self, key):
if type(key) is str:
return getattr(self, key)
else:
return self.__arr__[key]
def __setitem__(self, key, value):
self.__arr__[key] = value
def __str__(self):
return self.reference()
def __repr__(self):
return self.reference()
class TableInfo:
def __init__(self, table_name, cols, cxt:'Context'):
# statics
self.table_name = table_name
self.alias = set([table_name])
self.columns_byname = dict() # column_name, type
self.columns = []
self.cxt = cxt
self.cxt_name = None
self.views = set()
#keep track of temp vars
self.local_vars = dict()
self.rec = None
self.groupinfo = None
self.add_cols(cols)
# runtime
self.n_rows = 0 # number of rows
self.order = [] # assumptions
cxt.tables_byname[self.table_name] = self # construct reverse map
def reference(self):
if self not in self.cxt.tables_in_context:
counter = 0
base_name = self.table_name
if base_name in self.cxt.tables_in_context.values():
while (f'{base_name}_{counter}') in self.cxt.tables_in_context.values():
counter += 1
base_name = f'{base_name}_{counter}'
self.cxt_name = base_name
self.cxt.tables_in_context[self] = base_name
type_tags = '<'
for c in self.columns:
type_tags += c.type + ','
if type_tags.endswith(','):
type_tags = type_tags[:-1]
type_tags += '>'
self.cxt.emit(f'auto& {base_name} = *(TableInfo{type_tags} *)(cxt->tables["{self.table_name}"]);')
return self.cxt_name
def refer_all(self):
self.reference()
for c in self.columns:
c.reference()
def add_cols(self, cols, new = True):
for i, c in enumerate(cols):
self.add_col(c, new, i)
def add_col(self, c, new = True, i = 0):
_ty = c['type']
if new:
cname = f'get<{i}>({self.table_name})'
_ty = _ty if type(c) is ColRef else list(_ty.keys())[0]
col_object = ColRef(cname, _ty, c, 1, self, c['name'], len(self.columns))
else:
col_object = c
cname = c.cname
c.table = self
self.cxt.ccols_byname[cname] = col_object
self.columns_byname[c['name']] = col_object
self.columns.append(col_object)
def get_size(self):
size_tmp = 'tmp_sz_'+base62uuid(6)
self.cxt.emit(f'const auto& {size_tmp} = {self.columns[0].reference()}.size;')
return size_tmp
@property
def n_cols(self):
return len(self.columns)
def materialize_orderbys(self):
view_stack = ''
stack_name = ''
for o in self.order:
o.materialize()
if len(view_stack) == 0:
view_stack = o.view.name
stack_name = view_stack
else:
view_stack = view_stack+'['+ o.view.name +']'
# TODO: Optimize by doing everything in a stmt
if len(view_stack) > 0:
if len(self.order) > 1:
self.cxt.emit(f'{stack_name}:{view_stack}')
for c in self.columns:
c.order_pending = stack_name
self.order[0].node.view = stack_name
self.order.clear()
def get_col_d(self, col_name):
col = self.columns_byname[col_name]
if type(self.rec) is set:
self.rec.add(col)
return col
def get_ccolname_d(self, col_name):
return self.get_col_d(col_name).cname
def get_col(self, col_name):
self.materialize_orderbys()
col = self.get_col_d(col_name)
if type(col.order_pending) is str:
self.cxt.emit_no_flush(f'{col.cname}:{col.cname}[{col.order_pending}]')
col.order_pending = None
return col
def get_ccolname(self, col_name):
return self.get_col(col_name).cname
def add_alias(self, alias):
# TODO: Scoping of alias should be constrained to the query.
if alias in self.cxt.tables_byname.keys():
print("Error: table alias already exists")
return
self.cxt.tables_byname[alias] = self
self.alias.add(alias)
def parse_col_names(self, colExpr, materialize = True, raw = False):
# get_col = self.get_col if materialize else self.get_col_d
parsedColExpr = colExpr.split('.')
ret = None
if len(parsedColExpr) <= 1:
ret = self.get_col_d(colExpr)
else:
datasource = self.cxt.tables_byname.get(parsedColExpr[0])
if datasource is None:
raise ValueError(f'Table name/alias not defined: {parsedColExpr[0]}')
else:
ret = datasource.parse_col_names(parsedColExpr[1], raw = raw)
from engine.expr import index_expr
string = ret.reference() + index_expr
if self.groupinfo is not None and ret and ret in self.groupinfo.raw_groups:
string = f'get<{self.groupinfo.raw_groups.index(ret)}>({{y}})'
return (string, ret) if raw else string
class View:
def __init__(self, context, table = None, tmp = True):
self.table: TableInfo = table
self.name = 'v'+base62uuid(7)
if type(table) is TableInfo:
table.views.add(self)
self.context = context
def construct(self):
self.context.emit(f'{self.name}:()')
class Context:
function_head = '''
extern "C" int __DLLEXPORT__ dllmain(Context* cxt) {
using namespace std;
using namespace types;
'''
LOG_INFO = 'INFO'
LOG_ERROR = 'ERROR'
LOG_SILENT = 'SILENT'
def new(self):
self.tmp_names = set()
self.udf_map = dict()
self.headers = set(['\"./server/libaquery.h\"'])
self.finalized = False
# read header
self.ccode = ''
self.ccodelet = ''
with open('header.cxx', 'r') as outfile:
self.ccode = outfile.read()
# datasource will be available after `from' clause is parsed
# and will be deactivated when the `from' is out of scope
self.datasource = None
self.ds_stack = []
self.scans = []
self.removing_scan = False
def __init__(self):
self.tables:list[TableInfo] = []
self.tables_byname = dict()
self.ccols_byname = dict()
self.gc_name = 'gc_' + base62uuid(4)
self.tmp_names = set()
self.udf_map = dict()
self.headers = set(['\"./server/libaquery.h\"'])
self.finalized = False
self.log_level = Context.LOG_SILENT
self.print = print
# read header
self.ccode = ''
self.ccodelet = ''
self.columns_in_context = dict()
self.tables_in_context = dict()
with open('header.cxx', 'r') as outfile:
self.ccode = outfile.read()
# datasource will be available after `from' clause is parsed
# and will be deactivated when the `from' is out of scope
self.datasource = None
self.ds_stack = []
self.scans = []
self.removing_scan = False
def add_table(self, table_name, cols):
tbl = TableInfo(table_name, cols, self)
self.tables.append(tbl)
return tbl
def gen_tmptable(self):
from engine.utils import base62uuid
return f't{base62uuid(7)}'
def reg_tmp(self, name, f):
self.tmp_names.add(name)
self.emit(f"{self.gc_name}.reg({{{name}, 0,0{'' if f is None else ',{f}'}}});")
def define_tmp(self, typename, isPtr = True, f = None):
name = 'tmp_' + base62uuid()
if isPtr:
self.emit(f'auto* {name} = new {typename};')
self.reg_tmp(name, f)
else:
self.emit(f'auto {name} = {typename};')
return name
def emit(self, codelet):
self.ccode += self.ccodelet + codelet + '\n'
self.ccodelet = ''
def emit_no_flush(self, codelet):
self.ccode += codelet + '\n'
def emit_flush(self):
self.ccode += self.ccodelet + '\n'
self.ccodelet = ''
def emit_nonewline(self, codelet):
self.ccodelet += codelet
def datasource_top(self):
if len(self.ds_stack) > 0:
return self.ds_stack[-1]
else:
return None
def datasource_pop(self):
if len(self.ds_stack) > 0:
return self.ds_stack.pop()
else:
return None
def datasource_push(self, ds):
if type(ds) is TableInfo:
self.ds_stack.append(ds)
return ds
else:
return None
def remove_scan(self, scan, str_scan):
self.emit(str_scan)
self.scans.remove(scan)
def Info(self, msg):
if self.log_level.upper() == Context.LOG_INFO:
self.print(msg)
def Error(self, msg):
if self.log_level.upper() == Context.LOG_ERROR:
self.print(msg)
else:
self.Info(msg)
def finalize(self):
if not self.finalized:
headers = ''
for h in self.headers:
if h[0] != '"':
headers += '#include <' + h + '>\n'
else:
headers += '#include ' + h + '\n'
self.ccode = headers + self.function_head + self.ccode + 'return 0;\n}'
self.headers = set()
return self.ccode
def __str__(self):
self.finalize()
return self.ccode
def __repr__(self) -> str:
return self.__str__()
class ast_node:
types = dict()
header = []
def __init__(self, parent:"ast_node", node, context:Context = None):
self.context = parent.context if context is None else context
self.parent = parent
self.datasource = None
self.init(node)
self.produce(node)
self.spawn(node)
self.consume(node)
def emit(self, code):
self.context.emit(code)
def emit_no_ln(self, code):
self.context.emit_nonewline(code)
name = 'null'
# each ast node has 3 stages.
# `produce' generates info for child nodes
# `spawn' populates child nodes
# `consume' consumes info from child nodes and finalizes codegen
# For simple operators, there may not be need for some of these stages
def init(self, _):
pass
def produce(self, _):
pass
def spawn(self, _):
pass
def consume(self, _):
pass
# include classes in module as first order operators
def include(objs):
import inspect
for _, cls in inspect.getmembers(objs):
if inspect.isclass(cls) and issubclass(cls, ast_node) and not cls.name.startswith('_'):
ast_node.types[cls.name] = cls
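# Minimal sketch of the produce/spawn/consume protocol (hypothetical node type):
#   class comment(ast_node):
#       name = 'comment'
#       def produce(self, node):
#           self.emit(f'// {node[self.name]}')
# Registering it with include(sys.modules[__name__]) lets generate() dispatch
# {'comment': ...} nodes to this class.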

@ -0,0 +1,128 @@
# code-gen for data decl languages
from engine.orderby import orderby
from engine.ast import ColRef, TableInfo, ast_node, Context, include
from engine.scan import scan
from engine.utils import base62uuid
class create_table(ast_node):
name = 'create_table'
def __init__(self, parent: "ast_node", node, context: Context = None, cexprs = None, lineage = False):
self.cexprs = cexprs
self.lineage = lineage
super().__init__(parent, node, context)
def produce(self, node):
if type(node) is not TableInfo:
ct = node[self.name]
tbl = self.context.add_table(ct['name'], ct['columns'])
else:
tbl = node
col_type_str = ','.join([c.type for c in tbl.columns])
# create tables in the generated C++
self.emit(f"auto {tbl.table_name} = new TableInfo<{col_type_str}>(\"{tbl.table_name}\", {tbl.n_cols});")
self.emit("cxt->tables.insert({\"" + tbl.table_name + f"\", {tbl.table_name}"+"});")
self.context.tables_in_context[tbl] = tbl.table_name
tbl.cxt_name = tbl.table_name
tbl.refer_all()
# create an empty new table
if self.cexprs is None:
for c in tbl.columns:
self.emit(f'{c.cxt_name}.init("{c.name}");')
# create an output table
else:
# 1 to 1 lineage.
if len(self.context.scans) == 0:
if self.lineage:
order = 'order_' + base62uuid(6)
self.emit(f'auto {order} = {self.parent.datasource.cxt_name}->order_by<{orderby(self.parent, self.parent.assumptions).result()}>();')
self.lineage = '*' + order
else:
self.lineage = None
for i, c in enumerate(tbl.columns):
self.emit(f'{c.cxt_name}.init("{c.name}");')
self.emit(f"{c.cxt_name} = {self.cexprs[i](self.lineage)};")
self.lineage = None
self.parent.assumptions = None
else:
scanner:scan = self.context.scans[-1]
if self.lineage:
lineage_var = 'lineage_' + base62uuid(6)
counter_var = 'counter_' + base62uuid(6)
scanner.add(f'auto {lineage_var} = {self.datasource.cxt_name}->bind({tbl.cxt_name});', "init")
scanner.add(f'auto {counter_var} = 0;', "init")
scanner.add(f"{lineage_var}.emplace_back({counter_var}++);", "front")
self.lineage = f"{lineage_var}.rid"
for i, c in enumerate(tbl.columns):
scanner.add(f'{c.cxt_name}.init("{c.name}");', "init")
scanner.add(f"{c.cxt_name} = {self.cexprs[i](scanner.it_ver)};")
class insert(ast_node):
name = 'insert'
def produce(self, node):
ct = node[self.name]
table:TableInfo = self.context.tables_byname[ct]
values = node['query']['select']
if len(values) != table.n_cols:
raise ValueError("Column Mismatch")
table.refer_all()
for i, s in enumerate(values):
if 'value' in s:
cname = table.columns[i].cxt_name
self.emit(f"{cname}.emplace_back({s['value']});")
else:
# subquery, dispatch to select astnode
pass
class c(ast_node):
name='c'
def produce(self, node):
self.emit(node[self.name])
class load(ast_node):
name="load"
def produce(self, node):
self.context.headers.add('"csv.h"')
node = node[self.name]
table:TableInfo = self.context.tables_byname[node['table']]
table.refer_all()
csv_reader_name = 'csv_reader_' + base62uuid(6)
col_types = [c.type for c in table.columns]
col_tmp_names = ['tmp_'+base62uuid(8) for _ in range(len(table.columns))]
# col_type_str = ",".join(col_types)
col_names = ','.join([f'"{c.name}"' for c in table.columns])
self.emit(f'io::CSVReader<{len(col_types)}> {csv_reader_name}("{node["file"]["literal"]}");')
self.emit(f'{csv_reader_name}.read_header(io::ignore_extra_column, {col_names});')
for t, n in zip(col_types, col_tmp_names):
self.emit(f'{t} {n};')
self.emit(f'while({csv_reader_name}.read_row({",".join(col_tmp_names)})) {{ \n')
for i, c in enumerate(table.columns):
self.emit(f'{c.cxt_name}.emplace_back({col_tmp_names[i]});')
self.emit('}')
class outfile(ast_node):
name="_outfile"
def produce(self, node):
out_table:TableInfo = self.parent.out_table
filename = node['loc']['literal'] if 'loc' in node else node['literal']
sep = ',' if 'term' not in node else node['term']['literal']
file_pointer = 'fp_' + base62uuid(6)
self.emit(f'FILE* {file_pointer} = fopen("{filename}", "w");')
self.emit(f'{out_table.cxt_name}->printall("{sep}", "\\n", nullptr, {file_pointer});')
self.emit(f'fclose({file_pointer});')
# self.context.headers.add('fstream')
# cout_backup_buffer = 'stdout_' + base62uuid(4)
# ofstream = 'ofstream_' + base62uuid(6)
# self.emit(f'auto {cout_backup_buffer} = cout.rdbuf();')
# self.emit(f'auto {ofstream} = ofstream("{filename}");')
# self.emit(f'cout.rdbuf({ofstream}.rdbuf());')
# TODO: ADD STMTS.
# self.emit(f'cout.rdbuf({cout_backup_buffer});')
# self.emit(f'{ofstream}.close();')
import sys
include(sys.modules[__name__])

@ -0,0 +1,131 @@
from engine.ast import ast_node, ColRef
start_expr = 'f"'
index_expr = '{\'\' if x is None and y is None else f\'[{x}]\'}'
end_expr = '"'
class expr(ast_node):
name='expr'
builtin_func_maps = {
'max': 'max',
'min': 'min',
'avg': 'avg',
'sum': 'sum',
'count' : 'count',
'mins': ['mins', 'minw'],
'maxs': ['maxs', 'maxw'],
'avgs': ['avgs', 'avgw'],
'sums': ['sums', 'sumw'],
}
binary_ops = {
'sub':'-',
'add':'+',
'mul':'*',
'div':'/',
'mod':'%',
'and':'&&',
'or':'||',
'xor' : '^',
'gt':'>',
'lt':'<',
'lte':'<=',
'gte':'>='
}
compound_ops = {
}
unary_ops = {
'neg' : '-',
'not' : '!'
}
compound_generating_ops = ['avgs', 'mins', 'maxs', 'sums'] + \
list(binary_ops.keys()) + list(compound_ops.keys()) + list(unary_ops.keys())
def __init__(self, parent, node, materialize_cols = True, abs_col = False):
self.materialize_cols = materialize_cols
self.raw_col = None
self.__abs = abs_col
self.inside_agg = False
if type(parent) is expr:
self.inside_agg = parent.inside_agg
self.__abs = parent.__abs
ast_node.__init__(self, parent, node, None)
def init(self, _):
from engine.projection import projection
parent = self.parent
self.isvector = parent.isvector if type(parent) is expr else False
self.is_compound = parent.is_compound if type(parent) is expr else False
if type(parent) in [projection, expr]:
self.datasource = parent.datasource
else:
self.datasource = self.context.datasource
self.udf_map = parent.context.udf_map
self._expr = ''
self.cexpr = None
self.func_maps = {**self.udf_map, **self.builtin_func_maps}
def produce(self, node):
if type(node) is dict:
for key, val in node.items():
if key in self.func_maps:
# TODO: distinguish between UDF agg functions and other UDF functions.
self.inside_agg = True
self.context.headers.add('"./server/aggregations.h"')
if type(val) is list and len(val) > 1:
cfunc = self.func_maps[key]
cfunc = cfunc[len(val) - 1] if type(cfunc) is list else cfunc
self._expr += f"{cfunc}("
for i, p in enumerate(val):
self._expr += expr(self, p)._expr + (','if i<len(val)-1 else '')
else:
funcname = self.func_maps[key]
funcname = funcname[0] if type(funcname) is list else funcname
self._expr += f"{funcname}("
self._expr += expr(self, val)._expr
self._expr += ')'
self.inside_agg = False
elif key in self.binary_ops:
l = expr(self, val[0])._expr
r = expr(self, val[1])._expr
self._expr += f'({l}{self.binary_ops[key]}{r})'
elif key in self.compound_ops:
x = []
if type(val) is list:
for v in val:
x.append(expr(self, v)._expr)
self._expr = self.compound_ops[key][1](x)
elif key in self.unary_ops:
self._expr += f'{self.unary_ops[key]}({expr(self, val)._expr})'
else:
self.context.Error(f'Undefined expr: {key}{val}')
if key in self.compound_generating_ops and not self.is_compound:
self.is_compound = True
p = self.parent
while type(p) is expr and not p.is_compound:
p.is_compound = True
p = p.parent
elif type(node) is str:
p = self.parent
while type(p) is expr and not p.isvector:
p.isvector = True
p = p.parent
self._expr, self.raw_col = self.datasource.parse_col_names(node, self.materialize_cols, True)
self.raw_col = self.raw_col if type(self.raw_col) is ColRef else None
if self.__abs and self.raw_col:
self._expr = self.raw_col.reference() + ("" if self.inside_agg else index_expr)
elif type(node) is bool:
self._expr = '1' if node else '0'
else:
self._expr = f'{node}'
def toCExpr(_expr):
return lambda x = None, y = None : eval(start_expr + _expr + end_expr)
def consume(self, _):
self.cexpr = expr.toCExpr(self._expr)
def __str__(self):
return self._expr
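# Worked example of the lazy codegen above (illustrative): toCExpr wraps the
# accumulated fragment in an f-string template whose evaluation is deferred,
# so one expression renders differently per context:
#   ce = expr.toCExpr('col_a' + index_expr)
#   ce()       # -> 'col_a'     scalar context, no index bound
#   ce(x='i')  # -> 'col_a[i]'  vector context inside a scan loop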

@ -0,0 +1,72 @@
from engine.ast import ColRef, TableInfo, ast_node
from engine.orderby import assumption
from engine.scan import scan
from engine.utils import base62uuid
from engine.expr import expr
class groupby(ast_node):
name = '_groupby'
def init(self, _):
self.context.headers.add('"./server/hasher.h"')
self.context.headers.add('unordered_map')
self.group = 'g' + base62uuid(7)
self.group_type = 'record_type' + base62uuid(7)
self.datasource = self.parent.datasource
self.scanner = None
self.datasource.rec = set()
self.raw_groups = []
def produce(self, node):
if type(node) is not list:
node = [node]
g_contents = ''
g_contents_list = []
first_col = ''
for i, g in enumerate(node):
v = g['value']
e = expr(self, v)
if type(e.raw_col) is ColRef:
self.raw_groups.append(e.raw_col)
e = e._expr
# if v is compound expr, create tmp cols
if type(v) is not str:
tmpcol = 't' + base62uuid(7)
self.emit(f'auto {tmpcol} = {e};')
e = tmpcol
if i == 0:
first_col = e
g_contents_list.append(e)
g_contents_decltype = [f'decltype({c})' for c in g_contents_list]
g_contents = expr.toCExpr(','.join(g_contents_list))
self.emit(f'typedef record<{expr.toCExpr(",".join(g_contents_decltype))(0)}> {self.group_type};')
self.emit(f'unordered_map<{self.group_type}, vector_type<uint32_t>, '
f'transTypes<{self.group_type}, hasher>> {self.group};')
self.n_grps = len(node)
self.scanner = scan(self, self.datasource, expr.toCExpr(first_col)()+'.size')
self.scanner.add(f'{self.group}[forward_as_tuple({g_contents(self.scanner.it_ver)})].emplace_back({self.scanner.it_ver});')
def consume(self, _):
self.referenced = self.datasource.rec
self.datasource.rec = None
self.scanner.finalize()
def deal_with_assumptions(self, assumption:assumption, out:TableInfo):
gscanner = scan(self, self.group)
val_var = 'val_'+base62uuid(7)
gscanner.add(f'auto &{val_var} = {gscanner.it_ver}.second;')
gscanner.add(f'{self.datasource.cxt_name}->order_by<{assumption.result()}>(&{val_var});')
gscanner.finalize()
def finalize(self, cexprs, out:TableInfo):
gscanner = scan(self, self.group)
key_var = 'key_'+base62uuid(7)
val_var = 'val_'+base62uuid(7)
gscanner.add(f'auto &{key_var} = {gscanner.it_ver}.first;')
gscanner.add(f'auto &{val_var} = {gscanner.it_ver}.second;')
gscanner.add(';\n'.join([f'{out.columns[i].reference()}.emplace_back({ce(x=val_var, y=key_var)})' for i, ce in enumerate(cexprs)])+';')
gscanner.finalize()
self.datasource.groupinfo = None
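# The emitted C++ mirrors this hash-grouping pattern (Python sketch, illustrative):
#   groups = {}
#   for i, key in enumerate(zip(*group_columns)):
#       groups.setdefault(key, []).append(i)
# finalize() then walks each (key, row ids) pair to fill the output columns.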

@ -0,0 +1,6 @@
from engine.ast import ast_node
class join(ast_node):
name='join'

@ -0,0 +1,65 @@
from engine.ast import ColRef, TableInfo, View, ast_node, Context
from engine.utils import base62uuid, seps
from engine.expr import expr
class order_item:
def __init__(self, name, node, order = True):
self.name = name
self.order = order
self.node = node
self.materialized = False
def materialize(self):
if not self.materialized:
self.name = expr(self.node, self.name, False).cexpr
self.materialized = True
return ('' if self.order else '-') + f'({self.name})'
def __str__(self):
return self.name
def __repr__(self):
return self.__str__()
class orderby(ast_node):
name = '_orderby'
def __init__(self, parent: "ast_node", node, context: Context = None):
self.col_list = []
super().__init__(parent, node, context)
def init(self, _):
self.datasource = self.parent.datasource
self.order = []
self.view = ''
def produce(self, node):
if type(node) is not list:
node = [node]
for n in node:
order = not ('sort' in n and n['sort'] == 'desc')
col_id = self.datasource.columns_byname[n['value']].id
col_id = col_id if order else -col_id-1
if col_id not in self.col_list:
self.col_list.append(col_id)
self.order.append(order_item(n['value'], self, order))
def merge(self, node):
self.produce(node)
def finalize(self, references):
self.order = [ o for o in self.order if o.name in references ]
def result(self, sep:str = ','):
return sep.join([f"{c}" for c in self.col_list])
class assumption(orderby):
name = '_assumption'
def __init__(self, parent: "ast_node", node, context: Context = None, exclude = []):
self.exclude = exclude
super().__init__(parent, node, context)
def produce(self, node):
if type(node) is not list:
node = [node]
node = [n for n in node if n not in self.exclude]
return super().produce(node)
def empty(self):
return len(self.col_list) == 0

@ -0,0 +1,180 @@
from engine.ast import ColRef, TableInfo, ast_node, Context, include
from engine.groupby import groupby
from engine.join import join
from engine.expr import expr
from engine.orderby import assumption, orderby
from engine.scan import filter
from engine.utils import base62uuid, enlist, base62alp, has_other
from engine.ddl import create_table, outfile
import copy
class projection(ast_node):
name='select'
def __init__(self, parent:ast_node, node, context:Context = None, outname = None, disp = True):
self.disp = disp
self.outname = outname
self.group_node = None
self.assumptions = None
self.where = None
ast_node.__init__(self, parent, node, context)
def init(self, _):
if self.outname is None:
self.outname = self.context.gen_tmptable()
def produce(self, node):
p = node['select']
self.projections = p if type(p) is list else [p]
self.context.Info(node)
def spawn(self, node):
self.datasource = None
if 'from' in node:
from_clause = node['from']
if type(from_clause) is list:
# from joins
join(self, from_clause)
elif type(from_clause) is dict:
if 'value' in from_clause:
value = from_clause['value']
if type(value) is dict:
if 'select' in value:
# from subquery
projection(self, from_clause, disp = False)
else:
# TODO: from func over table
print(f'from func over table: {node}')
elif type(value) is str:
self.datasource = self.context.tables_byname[value]
if 'name' in from_clause:
self.datasource.add_alias(from_clause['name'])
if 'assumptions' in from_clause:
self.assumptions = enlist(from_clause['assumptions'])
elif type(from_clause) is str:
self.datasource = self.context.tables_byname[from_clause]
if self.datasource is None:
raise ValueError('spawn error: from clause')
if self.datasource is not None:
self.datasource_changed = True
self.prev_datasource = self.context.datasource
self.context.datasource = self.datasource
if 'where' in node:
self.where = filter(self, node['where'], True)
# self.datasource = filter(self, node['where'], True).output
# self.context.datasource = self.datasource
if 'groupby' in node:
self.group_node = groupby(self, node['groupby'])
self.datasource = copy.copy(self.datasource) # shallow copy
self.datasource.groupinfo = self.group_node
else:
self.group_node = None
def consume(self, node):
self.inv = True
disp_varname = 'd'+base62uuid(7)
has_groupby = self.group_node is not None
cexprs = []
flatten = False
cols = []
self.out_table = TableInfo('out_'+base62uuid(4), [], self.context)
if 'outfile' in node:
flatten = True
new_names = []
proj_raw_cols = []
for i, proj in enumerate(self.projections):
cname = ''
compound = False
self.datasource.rec = set()
if type(proj) is dict:
if 'value' in proj:
e = proj['value']
sname = expr(self, e)
if type(sname.raw_col) is ColRef:
proj_raw_cols.append(sname.raw_col)
sname = sname._expr
fname = expr.toCExpr(sname) # fastest access method at innermost context
absname = expr(self, e, abs_col=True)._expr # absolute name at function scope
# TODO: Make it single pass here.
compound = True # compound column
cexprs.append(fname)
cname = e if type(e) is str else ''.join([a if a in base62alp else '' for a in expr.toCExpr(absname)()])
if 'name' in proj: # renaming column by AS keyword
cname = proj['name']
new_names.append(cname)
elif type(proj) is str:
col = self.datasource.get_col_d(proj)
if type(col) is ColRef:
col.reference()
compound = compound and has_groupby and has_other(self.datasource.rec, self.group_node.referenced)
self.datasource.rec = None
typename = f'decays<decltype({absname})>'
if not compound:
typename = f'value_type<{typename}>'
cols.append(ColRef(cname, expr.toCExpr(typename)(), self.out_table, 0, None, cname, i, compound=compound))
self.out_table.add_cols(cols, False)
lineage = None
if has_groupby:
create_table(self, self.out_table) # creates empty out_table.
if self.assumptions is not None:
self.assumptions = assumption(self, self.assumptions, exclude=self.group_node.raw_groups)
if not self.assumptions.empty():
self.group_node.deal_with_assumptions(self.assumptions, self.out_table)
self.assumptions = None
self.group_node.finalize(cexprs, self.out_table)
else:
# if all assumptions in projections, treat as orderby
lineage = self.assumptions is not None and has_other(self.assumptions, proj_raw_cols)
spawn = create_table(self, self.out_table, cexprs = cexprs, lineage = lineage) # create and populate out_table.
if lineage and type(spawn.lineage) is str:
lineage = spawn.lineage
self.assumptions = orderby(self, self.assumptions) # do not exclude proj_raw_cols
else:
lineage = None
if self.where is not None:
self.where.finalize()
if type(lineage) is str:
order = 'order_' + base62uuid(6)
self.emit(f'auto {order} = {self.datasource.cxt_name}->order_by<{self.assumptions.result()}>({lineage});')
self.emit(f'{self.out_table.cxt_name}->materialize(*{order});')
self.assumptions = None
if self.assumptions is not None:
orderby_node = orderby(self, self.assumptions)
else:
orderby_node = None
if 'orderby' in node:
self.datasource = self.out_table
self.context.datasource = self.out_table # discard current ds
orderbys = node['orderby']
orderby_node = orderby(self, orderbys) if orderby_node is None else orderby_node.merge(orderbys)
if orderby_node is not None:
self.emit(f'auto {disp_varname} = {self.out_table.reference()}->order_by_view<{orderby_node.result()}>();')
else:
disp_varname = f'*{self.out_table.cxt_name}'
if self.disp:
self.emit(f'print({disp_varname});')
if flatten:
outfile(self, node['outfile'])
if self.datasource_changed:
self.context.datasource = self.prev_datasource
import sys
include(sys.modules[__name__])

@ -0,0 +1,99 @@
from engine.ast import ColRef, TableInfo, View, ast_node, Context
from engine.utils import base62uuid
from engine.expr import expr
class scan(ast_node):
name = 'scan'
def __init__(self, parent: "ast_node", node, size = None, context: Context = None, const = False):
self.type = type
self.size = size
self.const = "const " if const else ""
super().__init__(parent, node, context)
def init(self, _):
self.datasource = self.context.datasource
self.initializers = ''
self.start = ''
self.front = ''
self.body = ''
self.end = '}'
self.mode = None
self.filters = []
scan_vars = set(s.it_ver for s in self.context.scans)
self.it_ver = 'i' + base62uuid(2)
while self.it_ver in scan_vars:
self.it_ver = 'i' + base62uuid(6)
self.parent.context.scans.append(self)
def produce(self, node):
if type(node) is ColRef:
self.colref = node
if self.size is None:
self.mode = ["col", node.table]
self.start += f'for ({self.const}auto& {self.it_ver} : {node.reference()}) {{\n'
else:
self.mode = ["idx", node.table]
self.start += f"for (uint32_t {self.it_ver} = 0; {self.it_ver} < {node.reference()}.size; ++{self.it_ver}){{\\n"
elif type(node) is str:
self.mode = ["idx", None]
self.start+= f'for({self.const}auto& {self.it_ver} : {node}) {{\n'
else:
self.mode = ["idx", node] # Node is the TableInfo
self.start += f"for (uint32_t {self.it_ver} = 0; {self.it_ver} < {self.size}; ++{self.it_ver}){{\n"
def add(self, stmt, position = "body"):
if position == "body":
self.body += stmt + '\n'
elif position == "init":
self.initializers += stmt + '\n'
else:
self.front += stmt + '\n'
def finalize(self):
for f in self.filters:
self.start += f
self.end += '}'
self.context.remove_scan(self, self.initializers + self.start + self.front + self.body + self.end)
class filter(ast_node):
name = 'filter'
def __init__(self, parent: "ast_node", node, materialize = False, context = None):
self.materialize = materialize
super().__init__(parent, node, context)
def init(self, _):
self.datasource = self.context.datasource
self.view = View(self.context, self.datasource)
self.value = None
def spawn(self, node):
# TODO: deal with subqueries
self.modified_node = node
return super().spawn(node)
def __materialize__(self):
if self.materialize:
cols = [] if self.datasource is None else self.datasource.columns
self.output = TableInfo('tn'+base62uuid(6), cols, self.context)
self.output.construct()
if type(self.value) is View: # cond filtered on tables.
self.emit(f'{self.value.name}:&{self.value.name}')
for o, c in zip(self.output.columns,self.value.table.columns):
self.emit(f'{o.cname}:{c.cname}[{self.value.name}]')
elif self.value is not None: # cond is scalar
tmpVar = 't'+base62uuid(7)
self.emit(f'{tmpVar}:{self.value}')
for o, c in zip(self.output.columns, self.datasource.columns):
self.emit(f'{o.cname}:$[{tmpVar};{c.cname};()]')
def finalize(self):
self.scanner.finalize()
def consume(self, _):
# TODO: optimizations after converting expr to cnf
self.scanner = None
for s in self.context.scans:
if self.datasource == s.mode[1]:
self.scanner = s
break
if self.scanner is None:
self.scanner = scan(self, self.datasource, self.datasource.get_size())
self.expr = expr(self, self.modified_node)
self.scanner.filters.append(f'if ({self.expr.cexpr(self.scanner.it_ver)}) {{\n')

@ -0,0 +1,34 @@
from engine.ast import Context
class Types:
name = 'Any'
cname = 'void*'
ctype_name = "types::NONE"
def __init__(self, context:Context):
self.cxt = context
def cast_to(self, *_):
return self
def cast_from(self, *_):
return self
def __repr__(self) -> str:
return self.cname
class String(Types):
name = 'String'
cname = 'const char*'
ctype_name = "types::ASTR"
def cast_from(self, ty, val, container = None):
if type(ty) is Int:
self.cxt.emit()
class Int(Types):
name = "Int"
cname = "int"
ctype_name = "types::AINT"
class Float(Types):
name = "Float"
cname = "float"
ctype_name = "types::AFLOAT"

@ -0,0 +1,25 @@
import uuid
base62alp = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
def base62uuid(crop=8):
id = uuid.uuid4().int
ret = ''
while id:
ret = base62alp[id % 62] + ret
id //= 62
return ret[:crop] if len(ret) else '0'
def enlist(l):
return l if type(l) is list else [l]
def seps(s, i, l):
return s if i < len(l) - 1 else ''
def has_other(a, b):
for ai in a:
if ai not in b:
return True
return False
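# Quick self-check of the helpers above (illustrative):
#   base62uuid(6)             # e.g. 'x7Qm2L', random per run
#   enlist('a')               # ['a']
#   has_other([1, 2], [2, 3]) # True: 1 is absent from [2, 3]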


@ -0,0 +1,43 @@
#include <cstdint>
#include <deque>
using std::uint32_t;
template<class T, bool minmax>
void running(void *array, uint32_t len, uint32_t w){
T* arr = static_cast<T*> (array);
std::deque<std::pair<T, uint32_t>> cache;
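// Monotonic deque invariant: (value, index) pairs are kept so the front is
// always the current window extremum; back entries are evicted once dominated
// by the incoming value, front entries once their index ages out of the window w.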
for(int i = 0; i < len; ++i){
if(!cache.empty() && cache.front().second == i-w) cache.pop_front();
if constexpr(minmax)
while(!cache.empty() && cache.back().first>arr[i]) cache.pop_back();
else
while(!cache.empty() && cache.back().first<arr[i]) cache.pop_back();
cache.push_back({arr[i], i});
arr[i] = cache.front().first;
}
}
template<class T>
inline void mm(void *array, uint32_t len, uint32_t w, bool mm){
mm? running<T, true>(array, len, w) : running<T, false>(array, len, w);
}
extern "C" {
#include <stdio.h>
int mmw(void *array, unsigned long long misc[]){
char _ty = misc[0];
uint32_t len = misc[1];
uint32_t w = misc[2];
bool minmax = misc[3]-0x10000;
switch(_ty){
case 'F': mm<double>(array, len, w, minmax); break;
case 'C': case 'G': mm<unsigned char>(array, len, w, minmax); break;
case 'H': mm<unsigned short>(array, len, w, minmax); break;
case 'D': case 'I': mm<unsigned int>(array, len, w, minmax); break;
case 'T': case 'J': mm<long long>(array, len, w, minmax); break;
case 'L': if(len == 0) break;
default: printf("nyi %c\n", _ty);
}
return 0;
}
}

@ -0,0 +1,14 @@
CREATE TABLE sale(Month INT, sales INT)
LOAD DATA INFILE "moving_avg.csv"
INTO TABLE sale
FIELDS TERMINATED BY "\t"
SELECT Month,avgs(3,sales)
FROM sale
ASSUMING ASC Month
INTO OUTFILE "moving_avg_output.csv"
FIELDS TERMINATED BY ";"
select Month, mins(2,sales) from sale assuming desc Month group by sales
into outfile "flatten.csv"

@ -0,0 +1,6 @@
Month,sales
1,100
4,140
5,130
3,140
2,120

@ -0,0 +1,5 @@
#pragma once
template <class...T1, class ...T2>
struct lineage {
};

@ -0,0 +1,188 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|Win32">
<Configuration>Debug</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|Win32">
<Configuration>Release</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|x64">
<Configuration>Debug</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|x64">
<Configuration>Release</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<VCProjectVersion>16.0</VCProjectVersion>
<Keyword>Win32Proj</Keyword>
<ProjectGuid>{8081fdaa-4d13-4b7a-adb2-8224af7f1c81}</ProjectGuid>
<RootNamespace>Project1</RootNamespace>
<WindowsTargetPlatformVersion>10.0</WindowsTargetPlatformVersion>
<ProjectName>msc-plugin</ProjectName>
</PropertyGroup>
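  <!-- msc-plugin: builds the generated query snippet (..\out.cpp) as dll.so using MSVC. -->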
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v143</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>false</EnableASAN>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v143</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>true</EnableASAN>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v143</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>false</EnableASAN>
<EnableFuzzer>false</EnableFuzzer>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>DynamicLibrary</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v143</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>true</EnableASAN>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="Shared">
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<TargetExt />
<TargetName>dll.so</TargetName>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<TargetExt />
<TargetName>dll.so</TargetName>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<TargetExt />
<TargetName>dll.so</TargetName>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<TargetExt />
<TargetName>dll.so</TargetName>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<LanguageStandard>stdcpplatest</LanguageStandard>
<ControlFlowGuard>Guard</ControlFlowGuard>
<EnableModules>true</EnableModules>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>DebugFull</GenerateDebugInformation>
<OutputFile>$(ProjectDir)\..\dll.so</OutputFile>
<AssemblyDebug>true</AssemblyDebug>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<LanguageStandard>stdcpplatest</LanguageStandard>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<GenerateDebugInformation>true</GenerateDebugInformation>
<OutputFile>$(ProjectDir)\..\dll.so</OutputFile>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_ALLOW_RTCc_IN_STL;_CRT_SECURE_NO_WARNINGS;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<LanguageStandard>stdcpplatest</LanguageStandard>
<ControlFlowGuard>Guard</ControlFlowGuard>
<EnableModules>true</EnableModules>
<DebugInformationFormat>ProgramDatabase</DebugInformationFormat>
<InlineFunctionExpansion>Disabled</InlineFunctionExpansion>
<OmitFramePointers>false</OmitFramePointers>
<SmallerTypeCheck>true</SmallerTypeCheck>
<SupportJustMyCode>false</SupportJustMyCode>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>DebugFull</GenerateDebugInformation>
<OutputFile>$(ProjectDir)\..\dll.so</OutputFile>
<AssemblyDebug>true</AssemblyDebug>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<LanguageStandard>stdcpplatest</LanguageStandard>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<GenerateDebugInformation>true</GenerateDebugInformation>
<OutputFile>$(ProjectDir)\..\dll.so</OutputFile>
</Link>
</ItemDefinitionGroup>
<ItemGroup>
<ClCompile Include="..\out.cpp" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\csv.h" />
<ClInclude Include="..\server\aggregations.h" />
<ClInclude Include="..\server\gc.hpp" />
<ClInclude Include="..\server\hasher.h" />
<ClInclude Include="..\server\libaquery.h" />
<ClInclude Include="..\server\table.h" />
<ClInclude Include="..\server\types.h" />
<ClInclude Include="..\server\utils.h" />
<ClInclude Include="..\server\vector_type.hpp" />
<ClInclude Include="..\server\winhelper.h" />
</ItemGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
<ImportGroup Label="ExtensionTargets">
</ImportGroup>
</Project>

@ -0,0 +1,93 @@
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003" ToolsVersion="4.0">
<PropertyGroup>
<Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
<SchemaVersion>2.0</SchemaVersion>
<ProjectGuid>ccc243f5-663e-45b7-a6de-b2468c58b3a7</ProjectGuid>
<ProjectHome>.</ProjectHome>
<StartupFile>..\prompt.py</StartupFile>
<SearchPath>..\msvs-py</SearchPath>
<WorkingDirectory>..</WorkingDirectory>
<OutputPath>.</OutputPath>
<Name>msvs-py</Name>
<RootNamespace>msvs-py</RootNamespace>
<IsWindowsApplication>False</IsWindowsApplication>
</PropertyGroup>
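  <!-- msvs-py: Visual Studio Python project wrapping the AQuery prompt (..\prompt.py). -->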
<PropertyGroup Condition=" '$(Configuration)' == 'Debug' ">
<DebugSymbols>true</DebugSymbols>
<EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)' == 'Release' ">
<DebugSymbols>true</DebugSymbols>
<EnableUnmanagedDebugging>false</EnableUnmanagedDebugging>
</PropertyGroup>
<ItemGroup>
<Folder Include="..\aquery_parser\" />
<Folder Include="..\aquery_parser\__pycache__\" />
<Folder Include="..\engine\" />
<Folder Include="..\engine\__pycache__\" />
<Folder Include="parser\" />
<Folder Include="parser\__pycache__\" />
<Folder Include="__pycache__\" />
</ItemGroup>
<ItemGroup>
<Compile Include="..\aquery_parser\keywords.py" />
<Compile Include="..\aquery_parser\sql_parser.py" />
<Compile Include="..\aquery_parser\types.py" />
<Compile Include="..\aquery_parser\utils.py" />
<Compile Include="..\aquery_parser\windows.py" />
<Compile Include="..\aquery_parser\__init__.py" />
<Compile Include="..\engine\ast.py" />
<Compile Include="..\engine\ddl.py" />
<Compile Include="..\engine\expr.py" />
<Compile Include="..\engine\groupby.py" />
<Compile Include="..\engine\join.py" />
<Compile Include="..\engine\orderby.py" />
<Compile Include="..\engine\projection.py" />
<Compile Include="..\engine\scan.py" />
<Compile Include="..\engine\types.py" />
<Compile Include="..\engine\utils.py" />
<Compile Include="..\engine\__init__.py" />
<Compile Include="..\prompt.py" />
</ItemGroup>
<ItemGroup>
<Content Include="..\aquery_parser\__pycache__\keywords.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\keywords.cpython-39.pyc" />
<Content Include="..\aquery_parser\__pycache__\sql_parser.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\sql_parser.cpython-39.pyc" />
<Content Include="..\aquery_parser\__pycache__\types.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\types.cpython-39.pyc" />
<Content Include="..\aquery_parser\__pycache__\utils.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\utils.cpython-39.pyc" />
<Content Include="..\aquery_parser\__pycache__\windows.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\windows.cpython-39.pyc" />
<Content Include="..\aquery_parser\__pycache__\__init__.cpython-310.pyc" />
<Content Include="..\aquery_parser\__pycache__\__init__.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\ast.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\ast.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\ddl.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\ddl.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\expr.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\expr.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\groupby.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\join.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\join.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\orderby.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\projection.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\projection.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\scan.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\types.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\utils.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\utils.cpython-39.pyc" />
<Content Include="..\engine\__pycache__\__init__.cpython-310.pyc" />
<Content Include="..\engine\__pycache__\__init__.cpython-39.pyc" />
</ItemGroup>
<Import Project="$(MSBuildExtensionsPath32)\Microsoft\VisualStudio\v$(VisualStudioVersion)\Python Tools\Microsoft.PythonTools.targets" />
<!-- Uncomment the CoreCompile target to enable the Build command in
Visual Studio and specify your pre- and post-build commands in
the BeforeBuild and AfterBuild targets below. -->
<!--<Target Name="CoreCompile" />-->
<Target Name="BeforeBuild">
</Target>
<Target Name="AfterBuild">
</Target>
</Project>

@ -0,0 +1,99 @@
VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,RatecodeID,store_and_fwd_flag,PULocationID,DOLocationID,payment_type,fare_amount,extra,mta_tax,tip_amount,tolls_amount,improvement_surcharge,total_amount
1,1/9/2017 11:13,1/9/2017 11:25,1,3.3,1,N,263,161,1,12.5,0,0.5,2,0,0.3,15.3
1,1/9/2017 11:32,1/9/2017 11:36,1,0.9,1,N,186,234,1,5,0,0.5,1.45,0,0.3,7.25
1,1/9/2017 11:38,1/9/2017 11:42,1,1.1,1,N,164,161,1,5.5,0,0.5,1,0,0.3,7.3
1,1/9/2017 11:52,1/9/2017 11:57,1,1.1,1,N,236,75,1,6,0,0.5,1.7,0,0.3,8.5
2,1/1/2017 0:00,1/1/2017 0:00,1,0.02,2,N,249,234,2,52,0,0.5,0,0,0.3,52.8
1,1/1/2017 0:00,1/1/2017 0:03,1,0.5,1,N,48,48,2,4,0.5,0.5,0,0,0.3,5.3
2,1/1/2017 0:00,1/1/2017 0:39,4,7.75,1,N,186,36,1,22,0.5,0.5,4.66,0,0.3,27.96
1,1/1/2017 0:00,1/1/2017 0:06,1,0.8,1,N,162,161,1,6,0.5,0.5,1.45,0,0.3,8.75
1,1/1/2017 0:00,1/1/2017 0:08,2,0.9,1,N,48,50,1,7,0.5,0.5,0,0,0.3,8.3
2,1/1/2017 0:00,1/1/2017 0:05,5,1.76,1,N,140,74,2,7,0.5,0.5,0,0,0.3,8.3
2,1/1/2017 0:00,1/1/2017 0:15,1,8.47,1,N,138,262,1,24,0.5,0.5,7.71,5.54,0.3,38.55
1,1/1/2017 0:00,1/1/2017 0:11,2,2.4,1,N,142,236,2,10.5,0.5,0.5,0,0,0.3,11.8
1,1/1/2017 0:00,1/1/2017 0:23,2,12.6,5,N,161,265,1,60,0,0,10,0,0.3,70.3
1,1/1/2017 0:00,1/1/2017 0:08,1,0.9,1,N,234,186,1,7,0.5,0.5,2.05,0,0.3,10.35
2,1/1/2017 0:00,1/1/2017 0:09,4,2.43,1,N,141,107,1,9.5,0.5,0.5,2.7,0,0.3,13.5
2,1/1/2017 0:00,1/1/2017 0:16,2,2.6,1,N,79,163,1,12.5,0.5,0.5,2.76,0,0.3,16.56
2,1/1/2017 0:00,1/1/2017 0:18,5,4.25,1,N,148,36,2,16.5,0.5,0.5,0,0,0.3,17.8
2,1/1/2017 0:00,1/1/2017 0:07,1,0.65,1,N,48,68,1,6.5,0.5,0.5,1.7,0,0.3,9.5
2,1/1/2017 0:00,1/1/2017 0:34,1,3.42,1,N,230,148,1,22.5,0.5,0.5,0,0,0.3,23.8
1,1/1/2017 0:00,1/1/2017 0:24,1,6.6,1,N,186,232,2,23,0.5,0.5,0,0,0.3,24.3
1,1/1/2017 0:00,1/1/2017 0:02,1,0.5,1,N,36,37,2,4,0.5,0.5,0,0,0.3,5.3
1,1/1/2017 0:00,1/1/2017 0:08,1,1.2,1,N,41,74,1,7.5,0.5,0.5,1.75,0,0.3,10.55
1,1/1/2017 0:00,1/1/2017 0:12,1,1.7,1,N,125,45,2,9.5,0.5,0.5,0,0,0.3,10.8
1,1/1/2017 0:00,1/1/2017 0:09,1,5.3,1,N,138,192,2,16,0.5,0.5,0,0,0.3,17.3
1,1/1/2017 0:00,1/1/2017 0:01,2,0.2,1,N,143,143,2,3,0.5,0.5,0,0,0.3,4.3
2,1/1/2017 0:00,1/1/2017 0:15,1,4.68,1,N,140,223,1,16.5,0.5,0.5,4.45,0,0.3,22.25
2,1/1/2017 0:00,1/1/2017 0:12,1,1.9,1,N,68,48,1,10,0.5,0.5,0,0,0.3,11.3
2,1/1/2017 0:00,1/1/2017 0:24,2,2.1,1,N,239,48,2,16,0.5,0.5,0,0,0.3,17.3
2,1/1/2017 0:00,1/1/2017 0:13,1,2.75,1,N,233,114,1,11.5,0.5,0.5,2,0,0.3,14.8
2,1/1/2017 0:00,1/1/2017 0:00,1,0,5,N,14,14,1,32.8,0,0.5,0,0,0.3,33.6
2,1/1/2017 0:00,1/1/2017 0:10,5,1.53,1,N,239,48,2,6,0.5,0.5,0,0,0.3,7.3
2,1/1/2017 0:00,1/1/2017 0:16,1,1.26,1,N,164,161,1,10.5,0.5,0.5,2.36,0,0.3,14.16
1,1/1/2017 0:00,1/1/2017 0:07,1,1,1,N,256,255,2,6.5,0.5,0.5,0,0,0.3,7.8
2,1/1/2017 0:00,1/1/2017 0:19,2,13.97,1,N,138,181,1,37.5,0.5,0.5,11.64,0,0.3,50.44
1,1/1/2017 0:00,1/1/2017 0:04,1,0.7,1,N,237,43,2,5.5,0.5,0.5,0,0,0.3,6.8
1,1/1/2017 0:00,1/1/2017 0:04,1,0.4,1,N,158,68,2,4.5,0.5,0.5,0,0,0.3,5.8
2,1/1/2017 0:00,1/1/2017 0:20,4,1.85,1,N,229,68,2,13,0.5,0.5,0,0,0.3,14.3
2,1/1/2017 0:00,1/1/2017 0:08,1,4.29,1,N,132,10,2,13.5,0.5,0.5,0,0,0.3,14.8
1,1/1/2017 0:00,1/1/2017 0:09,1,4.4,1,N,170,87,1,14.5,0.5,0.5,3.15,0,0.3,18.95
1,1/1/2017 0:00,1/1/2017 0:01,1,0.1,1,N,243,243,2,3,0.5,0.5,0,0,0.3,4.3
2,1/1/2017 0:00,1/1/2017 0:03,1,0.24,1,N,161,161,1,4,0.5,0.5,1.06,0,0.3,6.36
2,1/1/2017 0:00,1/1/2017 0:03,1,0.92,1,N,50,246,1,4.5,0.5,0.5,1.45,0,0.3,7.25
1,1/1/2017 0:00,1/1/2017 0:18,1,1.1,1,N,237,161,2,12,0.5,0.5,0,0,0.3,13.3
1,1/1/2017 0:00,1/1/2017 0:12,1,6.9,1,N,138,80,2,20,0.5,0.5,0,0,0.3,21.3
2,1/1/2017 0:00,1/1/2017 0:04,5,0.6,1,N,238,238,2,5,0.5,0.5,0,0,0.3,6.3
2,1/1/2017 0:00,1/1/2017 0:03,3,0.56,1,N,74,75,2,4.5,0.5,0.5,0,0,0.3,5.8
1,1/1/2017 0:00,1/1/2017 0:06,1,0.9,1,N,113,249,1,6,0.5,0.5,2.15,0,0.3,9.45
2,1/1/2017 0:00,1/1/2017 0:35,1,2.88,1,N,163,113,1,21.5,0.5,0.5,4.56,0,0.3,27.36
1,1/1/2017 0:00,1/1/2017 0:24,2,3.9,1,N,129,179,2,19.5,0.5,0.5,0,0,0.3,20.8
2,1/1/2017 0:00,1/1/2017 0:09,1,1.7,1,N,41,75,1,8.5,0.5,0.5,2.45,0,0.3,12.25
1,1/1/2017 0:00,1/1/2017 0:06,3,0.7,1,N,229,161,1,6,0.5,0.5,1.45,0,0.3,8.75
1,1/1/2017 0:00,1/1/2017 0:03,4,0.9,1,N,236,236,2,5,0.5,0.5,0,0,0.3,6.3
2,1/1/2017 0:00,1/1/2017 0:06,1,0.79,1,N,163,237,1,6,0.5,0.5,1.82,0,0.3,9.12
2,1/1/2017 0:00,1/1/2017 0:08,1,1.25,1,N,234,162,2,7.5,0.5,0.5,0,0,0.3,8.8
2,1/1/2017 0:00,1/1/2017 0:08,5,2.05,1,N,113,233,1,8.5,0.5,0.5,1,0,0.3,10.8
2,1/1/2017 0:00,1/1/2017 0:12,1,2.56,1,N,113,229,1,10.5,0.5,0.5,8.2,0,0.3,20
2,1/1/2017 0:00,1/1/2017 0:11,1,0.1,1,N,264,264,2,8,0.5,0.5,0,0,0.3,9.3
2,1/1/2017 0:00,1/1/2017 0:08,1,1.78,1,N,113,231,1,8,0.5,0.5,1.86,0,0.3,11.16
1,1/1/2017 0:00,1/1/2017 0:31,2,10.5,1,N,138,230,2,35.5,0.5,0.5,0,5.54,0.3,42.34
1,1/1/2017 0:00,1/1/2017 0:05,1,0.9,1,N,148,4,1,6,0.5,0.5,1.45,0,0.3,8.75
2,1/1/2017 0:00,1/1/2017 0:09,1,1.97,1,N,74,168,2,9,0.5,0.5,0,0,0.3,10.3
2,1/1/2017 0:00,1/1/2017 0:42,6,20.99,2,N,132,249,2,52,0,0.5,0,0,0.3,52.8
1,1/1/2017 0:00,1/1/2017 0:02,1,0,1,N,249,249,3,3,0.5,0.5,0,0,0.3,4.3
2,1/1/2017 0:00,1/1/2017 0:06,2,0.66,1,N,79,79,1,5.5,0.5,0.5,1.36,0,0.3,8.16
1,1/1/2017 0:00,1/1/2017 0:12,1,3.6,1,N,263,7,2,13,0.5,0.5,0,0,0.3,14.3
2,1/1/2017 0:00,1/1/2017 0:12,6,4.02,1,N,41,235,1,14,0.5,0.5,3.06,0,0.3,18.36
2,1/1/2017 0:00,1/1/2017 0:07,2,2.66,1,N,148,233,1,10,0.5,0.5,2.26,0,0.3,13.56
1,1/1/2017 0:00,1/1/2017 0:19,1,11.4,1,N,138,181,2,31.5,0.5,0.5,0,0,0.3,32.8
2,1/1/2017 0:00,1/1/2017 0:15,1,6.59,1,N,143,42,2,20,0.5,0.5,0,0,0.3,21.3
2,1/1/2017 0:00,1/1/2017 0:20,1,2.82,1,N,144,170,2,14,0.5,0.5,0,0,0.3,15.3
2,1/1/2017 0:00,1/1/2017 0:07,1,1.09,1,N,129,129,2,7,0.5,0.5,0,0,0.3,8.3
2,1/1/2017 0:00,1/1/2017 0:01,2,0.41,1,N,239,238,1,3.5,0.5,0.5,0.72,0,0.3,5.52
1,1/1/2017 0:00,1/1/2017 0:13,2,2.1,1,Y,45,137,3,10.5,0.5,0.5,0,0,0.3,11.8
1,1/1/2017 0:00,1/1/2017 0:03,1,2.2,1,N,137,137,2,3.5,0.5,0.5,0,0,0.3,4.8
1,1/1/2017 0:00,1/1/2017 0:43,1,20.3,1,N,132,61,1,58,0.5,0.5,5,0,0.3,64.3
1,1/1/2017 0:00,1/1/2017 0:12,1,2.8,1,N,239,229,1,11.5,0.5,0.5,1,0,0.3,13.8
2,1/1/2017 0:00,1/1/2017 0:08,3,1.96,1,N,249,88,2,8.5,0.5,0.5,0,0,0.3,9.8
1,1/1/2017 0:00,1/1/2017 0:01,1,0,1,N,138,138,3,2.5,0.5,0.5,0,0,0.3,3.8
1,1/1/2017 0:00,1/1/2017 0:11,1,4.8,1,N,162,45,1,15.5,0.5,0.5,3.35,0,0.3,20.15
2,1/1/2017 0:00,1/1/2017 0:13,2,4.69,1,N,262,90,2,16,0.5,0.5,0,0,0.3,17.3
2,1/1/2017 0:00,1/1/2017 0:05,5,0.6,1,N,170,233,1,4.5,0.5,0.5,1.16,0,0.3,6.96
2,1/1/2017 0:00,1/1/2017 0:04,5,0.93,1,N,239,142,1,4.5,0.5,0.5,15,0,0.3,20.8
2,1/1/2017 0:00,1/1/2017 0:14,2,8.6,1,N,148,7,2,25,0.5,0.5,0,0,0.3,26.3
2,1/1/2017 0:00,1/1/2017 0:07,5,1.17,1,N,158,90,1,6.5,0.5,0.5,2.34,0,0.3,10.14
2,1/1/2017 0:00,1/1/2017 0:14,2,4.41,1,N,233,74,1,14.5,0.5,0.5,2,0,0.3,17.8
1,1/1/2017 0:00,1/1/2017 0:05,1,0.9,1,N,106,106,2,6,0.5,0.5,0,0,0.3,7.3
1,1/1/2017 0:00,1/1/2017 0:17,2,3,1,N,231,107,1,14,0.5,0.5,2,0,0.3,17.3
1,1/1/2017 0:00,1/1/2017 0:10,1,1.2,1,N,113,137,2,8,0.5,0.5,0,0,0.3,9.3
1,1/1/2017 0:00,1/1/2017 0:14,1,1.2,1,N,246,186,1,9.5,0.5,0.5,3.2,0,0.3,14
1,1/1/2017 0:00,1/1/2017 0:21,1,5.9,1,N,162,129,1,20.5,0.5,0.5,4.35,0,0.3,26.15
1,1/1/2017 0:00,1/1/2017 0:05,2,0.5,1,N,264,264,1,5,0.5,0.5,1.25,0,0.3,7.55
2,1/1/2017 0:00,1/1/2017 0:23,1,6.58,1,N,186,152,1,23.5,0.5,0.5,7.44,0,0.3,32.24
1,1/1/2017 0:01,1/1/2017 0:17,1,3.7,1,N,50,79,1,14,0.5,0.5,3.05,0,0.3,18.35
2,1/1/2017 0:01,1/1/2017 0:23,1,5.91,1,N,125,37,1,21,0.5,0.5,4.46,0,0.3,26.76
1,1/1/2017 0:01,1/1/2017 0:09,1,1.2,1,N,152,42,2,7.5,0.5,0.5,0,0,0.3,8.8
1,1/1/2017 0:01,1/1/2017 0:05,2,1.2,1,N,137,162,1,6,0.5,0.5,1,0,0.3,8.3
1,1/1/2017 0:01,1/1/2017 0:16,2,7.4,1,N,132,130,2,22.5,0.5,0.5,0,0,0.3,23.8
2,1/1/2017 0:01,1/1/2017 0:08,6,0.87,1,N,107,107,1,6,0.5,0.5,5,0,0.3,12.3

@ -0,0 +1,57 @@
#include "./server/aggregations.h"
#include <unordered_map>
#include "csv.h"
#include "./server/hasher.h"
#include "./server/libaquery.h"
extern "C" int __DLLEXPORT__ dllmain(Context* cxt) {
using namespace std;
using namespace types;
auto test = new TableInfo<int,int,int,int>("test", 4);
cxt->tables.insert({"test", test});
auto& test_a = *(ColRef<int> *)(&test->colrefs[0]);
auto& test_b = *(ColRef<int> *)(&test->colrefs[1]);
auto& test_c = *(ColRef<int> *)(&test->colrefs[2]);
auto& test_d = *(ColRef<int> *)(&test->colrefs[3]);
test_a.init("a");
test_b.init("b");
test_c.init("c");
test_d.init("d");
io::CSVReader<4> csv_reader_6pX2fy("test.csv");
csv_reader_6pX2fy.read_header(io::ignore_extra_column, "a","b","c","d");
int tmp_1E7DxvaO;
int tmp_5JwTTT4O;
int tmp_3gbplDAu;
int tmp_XK4BgA6z;
while(csv_reader_6pX2fy.read_row(tmp_1E7DxvaO,tmp_5JwTTT4O,tmp_3gbplDAu,tmp_XK4BgA6z)) {
test_a.emplace_back(tmp_1E7DxvaO);
test_b.emplace_back(tmp_5JwTTT4O);
test_c.emplace_back(tmp_3gbplDAu);
test_d.emplace_back(tmp_XK4BgA6z);
}
typedef record<decltype(test_a[0]),decltype(test_b[0]),decltype(test_d[0])> record_type61iBrX3;
unordered_map<record_type61iBrX3, vector_type<uint32_t>, transTypes<record_type61iBrX3, hasher>> g7sUysrP;
for (uint32_t i2I = 0; i2I < test_a.size; ++i2I){
g7sUysrP[forward_as_tuple(test_a[i2I],test_b[i2I],test_d[i2I])].emplace_back(i2I);
}
auto out_1SHu = new TableInfo<decays<decltype(sum(test_c))>,value_type<decays<decltype(test_b)>>,value_type<decays<decltype(test_d)>>>("out_1SHu", 3);
cxt->tables.insert({"out_1SHu", out_1SHu});
auto& out_1SHu_sumtestc = *(ColRef<decays<decltype(sum(test_c))>> *)(&out_1SHu->colrefs[0]);
auto& out_1SHu_b = *(ColRef<value_type<decays<decltype(test_b)>>> *)(&out_1SHu->colrefs[1]);
auto& out_1SHu_d = *(ColRef<value_type<decays<decltype(test_d)>>> *)(&out_1SHu->colrefs[2]);
out_1SHu_sumtestc.init("sumtestc");
out_1SHu_b.init("b");
out_1SHu_d.init("d");
for(auto& i64 : g7sUysrP) {
auto &key_645kbJO = i64.first;
auto &val_1UnUa89 = i64.second;
out_1SHu_sumtestc.emplace_back(sum(test_c[val_1UnUa89]));
out_1SHu_b.emplace_back(get<1>(key_645kbJO));
out_1SHu_d.emplace_back(get<2>(key_645kbJO));
}
auto d6tAcglo = out_1SHu->order_by_view<-3,1>();
print(d6tAcglo);
return 0;
}

@ -0,0 +1,60 @@
#include "./server/libaquery.h"
#include <unordered_map>
#include "./server/hasher.h"
#include "csv.h"
#include "./server/aggregations.h"
extern "C" int __DLLEXPORT__ dllmain(Context* cxt) {
using namespace std;
using namespace types;
auto sale = new TableInfo<int,int>("sale", 2);
cxt->tables.insert({"sale", sale});
auto& sale_Month = *(ColRef<int> *)(&sale->colrefs[0]);
auto& sale_sales = *(ColRef<int> *)(&sale->colrefs[1]);
sale_Month.init();
sale_sales.init();
io::CSVReader<2> csv_reader_53ychC("moving_avg.csv");
csv_reader_53ychC.read_header(io::ignore_extra_column, "Month","sales");
int tmp_7ttMnHd3;
int tmp_5nHjeAtP;
while(csv_reader_53ychC.read_row(tmp_7ttMnHd3,tmp_5nHjeAtP)) {
sale_Month.emplace_back(tmp_7ttMnHd3);
sale_sales.emplace_back(tmp_5nHjeAtP);
}
auto out_3Xio = new TableInfo<decays<decltype(sale_Month[0])>,decays<decltype(avgw(3,sale_sales))>>("out_3Xio", 2);
cxt->tables.insert({"out_3Xio", out_3Xio});
auto& out_3Xio_Month = *(ColRef<decays<decltype(sale_Month[0])>> *)(&out_3Xio->colrefs[0]);
auto& out_3Xio_avgsw3salesales = *(ColRef<decays<decltype(avgw(3,sale_sales))>> *)(&out_3Xio->colrefs[1]);
out_3Xio_Month.init();
out_3Xio_Month = sale_Month;
out_3Xio_avgsw3salesales.init();
out_3Xio_avgsw3salesales = avgw(3,sale_sales);
// print(*out_3Xio);
FILE* fp_4nKGhD = fopen("moving_avg_output.csv", "w");
out_3Xio->printall(",", "\n", nullptr, fp_4nKGhD);
fclose(fp_4nKGhD);
typedef record<decltype(sale_sales[0])> record_type1H2vDGL;
unordered_map<record_type1H2vDGL, vector_type<uint32_t>, transTypes<record_type1H2vDGL, hasher>> g6Mjxfk5;
for (uint32_t i7u = 0; i7u < sale_sales.size; ++i7u){
g6Mjxfk5[forward_as_tuple(sale_sales[i7u])].emplace_back(i7u);
}
auto out_2IU2 = new TableInfo<decays<decltype(sale_sales[0])>,decays<decltype(minw(2,sale_Month))>>("out_2IU2", 2);
cxt->tables.insert({"out_2IU2", out_2IU2});
auto& out_2IU2_sales = *(ColRef<decays<decltype(sale_sales[0])>> *)(&out_2IU2->colrefs[0]);
auto& out_2IU2_minsw2saleMonth = *(ColRef<decays<decltype(minw(2,sale_Month))>> *)(&out_2IU2->colrefs[1]);
out_2IU2_sales.init();
out_2IU2_minsw2saleMonth.init();
for(auto& i5J : g6Mjxfk5) {
auto &key_4jl5toH = i5J.first;
auto &val_VJGwVwH = i5J.second;
out_2IU2_sales.emplace_back(get<0>(key_4jl5toH));
out_2IU2_minsw2saleMonth.emplace_back(minw(2,sale_Month[val_VJGwVwH]));
}
// print(*out_2IU2);
FILE* fp_18R4fY = fopen("flatten.csv", "w");
out_2IU2->printall(",","\n", nullptr, fp_18R4fY);
fclose(fp_18R4fY);
return 0;
}

@ -0,0 +1,289 @@
import enum
import re
import time
import dbconn
from mo_parsing import ParseException
import aquery_parser as parser
import engine
import reconstruct as xengine
import subprocess
import mmap
import sys
import os
from engine.utils import base62uuid
import atexit
import threading
import ctypes
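# The prompt drives the execution engine in one of two modes: Threaded loads
# the server as a shared library and runs its main() on a daemon thread, while
# IPC spawns server.bin as a child process synchronized through a two-byte
# shared-memory mailbox ([running, new job]).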
class RunType(enum.Enum):
Threaded = 0
IPC = 1
server_mode = RunType.Threaded
server_bin = 'server.bin' if server_mode == RunType.IPC else 'server.so'
try:
os.remove(server_bin)
except Exception as e:
print(type(e), e)
nullstream = open(os.devnull, 'w')
subprocess.call(['make', server_bin], stdout=nullstream)
cleanup = True
def rm():
global cleanup
if cleanup:
mm.seek(0,os.SEEK_SET)
mm.write(b'\x00\x00')
mm.flush()
try:
time.sleep(.001)
server.kill()
time.sleep(.001)
server.terminate()
except OSError:
pass
files = os.listdir('.')
for f in files:
if f.endswith('.shm'):
os.remove(f)
mm.close()
cleanup = False
nullstream.close()
def init_ipc():
global shm, server, basecmd, mm
shm = base62uuid()
if sys.platform != 'win32':
import readline
shm += '.shm'
basecmd = ['bash', '-c', 'rlwrap k']
mm = None
if not os.path.isfile(shm):
# create initial file
with open(shm, "w+b") as handle:
handle.write(b'\x01\x00') # [running, new job]
handle.flush()
mm = mmap.mmap(handle.fileno(), 2, access=mmap.ACCESS_WRITE, offset=0)
if mm is None:
exit(1)
else:
basecmd = ['bash.exe', '-c', 'rlwrap ./k']
mm = mmap.mmap(0, 2, shm)
mm.write(b'\x01\x00')
mm.flush()
server = subprocess.Popen(["./server.bin", shm])
import numpy as np
c = lambda _ba: ctypes.cast((ctypes.c_char * len(_ba)).from_buffer(_ba), ctypes.c_char_p)
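# View a bytearray as a raw C char* so the config block below can be handed
# to the server's entry point through ctypes.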
class Config:
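    # Shared control block: four int32 slots [running, new_query, server_mode,
    # n_buffers] followed by optional per-buffer sizes. np_buf is a NumPy view
    # over the same bytes, so the properties below read/write the block in place.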
def __init__(self, nq = 0, mode = server_mode, n_bufs = 0, bf_szs = []) -> None:
self.int_size = 4
self.n_attrib = 4
self.buf = bytearray((self.n_attrib + n_bufs) * self.int_size)
self.np_buf = np.ndarray(shape=(self.n_attrib), buffer=self.buf, dtype=np.int32)
self.new_query = nq
self.server_mode = mode.value
self.running = 1
self.n_buffers = n_bufs
@property
def running(self):
return self.np_buf[0]
@running.setter
def running(self, rn):
self.np_buf[0] = rn
@property
def new_query(self):
return self.np_buf[1]
@new_query.setter
def new_query(self, nq):
self.np_buf[1] = nq
@property
def server_mode(self):
return self.np_buf[2]
@server_mode.setter
def server_mode(self, mode):
self.np_buf[2] = mode
@property
def n_buffers(self):
return self.np_buf[3]
@n_buffers.setter
def n_buffers(self, n_bufs):
self.np_buf[3] = n_bufs
def set_bufszs(self, buf_szs):
for i in range(min(len(buf_szs), self.n_buffers)):
self.np_buf[i+self.n_attrib] = buf_szs[i]
@property
def c(self):
return c(self.buf)
cfg = Config()
th = None
def init_threaded():
if os.name == 'nt':
t = os.environ['PATH'].lower().split(';')
vars = re.compile('%.*%')
for e in t:
if(len(e) != 0):
if '%' in e:
try:
m_e = vars.findall(e)
for m in m_e:
e = e.replace(m, os.environ[m[1:-1]])
# print(m, e)
except Exception:
continue
os.add_dll_directory(e)
server_so = ctypes.CDLL('./'+server_bin)
global cfg, th
th = threading.Thread(target=server_so['main'], args=(-1, ctypes.POINTER(ctypes.c_char_p)(cfg.c)), daemon=True)
th.start()
if server_mode == RunType.IPC:
atexit.register(rm)
init = init_ipc
set_ready = lambda : mm.seek(0,os.SEEK_SET) or mm.write(b'\x01\x01')
def __get_ready():
mm.seek(0,os.SEEK_SET)
return mm.read(2)[1]
get_ready = __get_ready
server_status = lambda : server.poll() is not None
else:
init = init_threaded
rm = lambda: None
def __set_ready():
global cfg
cfg.new_query = 1
set_ready = __set_ready
get_ready = lambda:cfg.new_query
server_status = lambda : not th.is_alive()
init()
test_parser = True
# code to test parser
ws = re.compile(r'\s+')
q = 'SELECT p.Name, v.Name FROM Production.Product p JOIN Purchasing.ProductVendor pv ON p.ProductID = pv.ProductID JOIN Purchasing.Vendor v ON pv.BusinessEntityID = v.BusinessEntityID WHERE ProductSubcategoryID = 15 ORDER BY v.Name;'
res = parser.parse(q)
stmts = res # seed so that 'exec' has something to run before any input is parsed
keep = True
cxt = engine.initialize()
cxt.Info(res)
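# Interactive loop. Plain input is parsed as an AQuery statement; special
# commands: exec (codegen + compile + run the last parse), xexec (emit SQL via
# the reconstruct engine), log, k, print, keep, format/fmt, exit, r, rr,
# save [file], and f [file] to parse a script from disk.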
while test_parser:
try:
if server_status():
init()
while get_ready():
time.sleep(.00001)
print("> ", end="")
q = input().lower()
if q == 'exec':
if not keep or cxt is None:
cxt = engine.initialize()
else:
cxt.new()
stmts_stmts = stmts['stmts']
if type(stmts_stmts) is list:
for s in stmts_stmts:
engine.generate(s, cxt)
else:
engine.generate(stmts_stmts, cxt)
cxt.Info(cxt.ccode)
with open('out.cpp', 'wb') as outfile:
outfile.write((cxt.finalize()).encode('utf-8'))
if subprocess.call(['make', 'snippet'], stdout = nullstream) == 0:
set_ready()
continue
if q == 'xexec':
cxt = xengine.initialize()
stmts_stmts = stmts['stmts']
if type(stmts_stmts) is list:
for s in stmts_stmts:
xengine.generate(s, cxt)
else:
xengine.generate(stmts_stmts, cxt)
print(cxt.sql)
continue
elif q.startswith('log'):
qs = re.split(' |\t', q)
if len(qs) > 1:
cxt.log_level = qs[1]
else:
cxt.print(cxt.log_level)
continue
elif q == 'k':
subprocess.call(basecmd)
continue
elif q == 'print':
cxt.print(stmts)
continue
elif q == 'keep':
keep = not keep
continue
elif q == 'format' or q == 'fmt':
subprocess.call(['clang-format', 'out.cpp'])
elif q == 'exit':
break
elif q == 'r':
if subprocess.call(['make', 'snippet']) == 0:
set_ready()
continue
elif q == 'rr':
set_ready()
continue
elif q.startswith('save'):
filename = re.split(' |\t', q)
if (len(filename) > 1):
filename = filename[1]
else:
filename = f'out_{base62uuid(4)}.cpp'
with open(filename, 'wb') as outfile:
outfile.write((cxt.finalize()).encode('utf-8'))
continue
trimed = ws.sub(' ', q.lower()).split(' ')
if trimed[0].startswith('f'):
fn = 'stock.a' if len(trimed) <= 1 or len(trimed[1]) == 0 \
else trimed[1]
with open(fn, 'r') as file:
contents = file.read()#.lower()
stmts = parser.parse(contents)
continue
stmts = parser.parse(q)
cxt.Info(stmts)
except ParseException as e:
print(e)
continue
except (ValueError, FileNotFoundError) as e:
# rm()
# init()
print(e)
except (KeyboardInterrupt):
break
except (Exception) as e:
rm()
raise e
rm()

112
q.sql

@ -0,0 +1,112 @@
FUNCTION
execStrategy ( alloc , mavgday , mavgmonth , px ) {
buySignal := mavgday > mavgmonth ;
f := a + b ;
alloc * prd (
CASE maxs ( buySignal )
WHEN TRUE THEN
CASE buySignal
WHEN TRUE THEN 1 / px
ELSE px
END
ELSE 1
END )
}
FUNCTION covariance (x , y ) {
xmean := avg (x) ;
ymean := avg (y) ;
avg (( x - xmean ) * (y - ymean ))
}
FUNCTION sd ( x) {
sqrt ( covariance (x , x) )
}
FUNCTION pairCorr (x , y ) {
covariance (x , y ) / ( sd (x) * sd (y ))
}
<k>
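/ ARMA(p,q) one-step forecast demo: phi are the AR coefficients, theta the MA
/ coefficients, l the observed series and e the noise terms.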
`
p:5
q:2
phi:(p+1)?1.
theta:q?1.
"p q phi theta"
p
q
phi
theta
l:()
e:()
`
L1:10?20
Le1:10?2.
L2:3?20
Le2:3?2.
"L1 Le1 L2 Le2"
L1
Le1
L2
Le2
`
"Add L1, then predict"
l:l,L1
e:e,Le1
predict:(phi(0)) + (sum ({[x](phi(x+1)) * (l(((#l)-1)-x))}[!p])) - (sum ({[x](theta(x)) * (e(((#e)-1)-x))}[!q]))
predict
`
"Add L2, then predict"
l:l,L2
e:e,Le2
predict:(phi(0)) + (sum ({[x](phi(x+1)) * (l(((#l)-1)-x))}[!p])) - (sum ({[x](theta(x)) * (e(((#e)-1)-x))}[!q]))
predict
</k>
WITH
Target (Id , TradeDate , ClosePrice ) AS
( SELECT
Id , TradeDate , ClosePrice
FROM price
WHERE Id IN stock10 AND
TradeDate >= startYear10 AND
TradeDate <= startYear10 + 365 * 10),
weekly (Id , bucket , name , low , high , mean ) AS
( SELECT
Id ,
timeBucket ,
" weekly " ,
min ( ClosePrice ) ,
max ( ClosePrice ) ,
avg ( ClosePrice )
FROM Target
GROUP BY Id , getWeek ( TradeDate ) as
timeBucket ),
monthly ( Id , bucket , name , low , high , mean ) AS
( SELECT
Id ,
timeBucket ,
" monthly " ,
min ( ClosePrice ) ,
max ( ClosePrice ) ,
avg ( ClosePrice )
FROM Target
GROUP BY Id , getMonth ( TradeDate ) as
timeBucket ),
yearly (Id , bucket , name , low , high , mean ) AS
( SELECT
Id ,
timeBucket ,
" yearly " ,
min ( ClosePrice ) ,
max ( ClosePrice ) ,
avg ( ClosePrice )
FROM Target
GROUP BY Id , getYear ( TradeDate ) as
timeBucket )
SELECT
Id , bucket , name , low , high , mean
FROM
CONCATENATE ( weekly , monthly , yearly )
ASSUMING ASC Id , ASC name , ASC bucket

@ -0,0 +1,10 @@
CREATE TABLE test(a INT, b INT, c INT, d INT)
LOAD DATA INFILE "test.csv"
INTO TABLE test
FIELDS TERMINATED BY ","
SELECT sum(c), b, d
FROM test
group by a,b,d
order by d DESC, b ASC

@ -0,0 +1,11 @@
from reconstruct.ast import Context, ast_node
def initialize():
return Context()
def generate(ast, cxt):
for k in ast.keys():
if k in ast_node.types.keys():
ast_node.types[k](None, ast, cxt)
__all__ = ["initialize", "generate"]

@ -0,0 +1,303 @@
from engine.utils import enlist, base62uuid, base62alp
from reconstruct.storage import Context, TableInfo, ColRef
class ast_node:
header = []
types = dict()
first_order = False
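    # Every node runs init -> produce -> spawn -> consume on construction;
    # subclasses override the stages they need and accumulate SQL text in self.sql.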
def __init__(self, parent:"ast_node", node, context:Context = None):
self.context = parent.context if context is None else context
self.parent = parent
self.sql = ''
self.datasource = None
self.init(node)
self.produce(node)
self.spawn(node)
self.consume(node)
def emit(self, code):
self.context.emit(code)
def add(self, code):
self.sql += code + ' '
name = 'null'
def init(self, _):
self.add(self.__class__.name.upper())
def produce(self, _):
pass
def spawn(self, _):
pass
def consume(self, _):
if self.parent is None:
self.emit(self.sql+';\n')
from reconstruct.expr import expr
class projection(ast_node):
name = 'projection'
first_order = 'select'
def init(self, _):
pass
def produce(self, node):
p = node['select']
self.projections = p if type(p) is list else [p]
self.add('SELECT')
def spawn(self, node):
self.datasource = None # datasource is Join instead of TableInfo
if 'from' in node:
from_clause = node['from']
self.datasource = join(self, from_clause)
if 'assumptions' in from_clause:
self.assumptions = enlist(from_clause['assumptions'])
if self.datasource is not None:
self.datasource_changed = True
self.prev_datasource = self.context.datasource
self.context.datasource = self.datasource
if 'where' in node:
self.where = filter(self, node['where'])
else:
self.where = None
if 'groupby' in node:
self.group_node = groupby(self, node['groupby'])
else:
self.group_node = None
def consume(self, node):
# deal with projections
self.out_table = TableInfo('out_'+base62uuid(4), [], self.context)
cols = []
col_exprs = []
        for i, proj in enumerate(self.projections):
            compound = False
            self.datasource.rec = set()
            name = ''
            disp_name = ''
            if type(proj) is dict:
                if 'value' in proj:
                    e = proj['value']
                    name = expr(self, e).sql
                    # display name: keep only characters in the base62 alphabet
                    disp_name = ''.join([a for a in name if a in base62alp])
                    compound = True # compound column
                if 'name' in proj: # renaming column by AS keyword
                    name += ' ' + proj['name']
                col_exprs.append(name)
            elif type(proj) is str:
                col = self.datasource.get_col(proj)
                name = col.name
                disp_name = name
                col_exprs.append(name)
            self.datasource.rec = None
            # TODO: Type deduction in Python
            cols.append(ColRef('unknown', self.out_table, None, disp_name, i, compound=compound))
        self.add(', '.join(col_exprs))
        def finalize(astnode: ast_node):
            if astnode is not None:
                self.add(astnode.sql)
        self.add('FROM')
        finalize(self.datasource)
        finalize(self.where)
        finalize(self.group_node)
if 'orderby' in node:
self.add(orderby(self, node['orderby']).sql)
if 'outfile' in node:
self.add(outfile(self, node['outfile']).sql)
if self.parent is None:
self.emit(self.sql+';\n')
else:
# TODO: subquery, name create tmp-table from subquery w/ alias as name
pass
class orderby(ast_node):
name = 'order by'
def produce(self, node):
if node is None:
self.sql = ''
return
elif type(node) is not list:
node = [node]
o_list = []
for o in node:
o_str = expr(self, o['value']).sql
if 'sort' in o and f'{o["sort"]}'.lower() == 'desc':
o_str += ' ' + 'DESC'
o_list.append(o_str)
self.add(', '.join(o_list))
class groupby(orderby):
name = 'group by'
class join(ast_node):
name = 'join'
def init(self, _):
self.joins:list = []
self.tables = []
self.tables_dir = dict()
# self.tmp_name = 'join_' + base62uuid(4)
# self.datasource = TableInfo(self.tmp_name, [], self.context)
def append(self, tbls, __alias = ''):
alias = lambda t : '(' + t + ') ' + __alias if len(__alias) else t
if type(tbls) is join:
self.joins.append(alias(tbls.__str__()))
self.tables += tbls.tables
self.tables_dir = {**self.tables_dir, **tbls.tables_dir}
elif type(tbls) is TableInfo:
self.joins.append(alias(tbls.table_name))
self.tables.append(tbls)
self.tables_dir[tbls.table_name] = tbls
for a in tbls.alias:
self.tables_dir[a] = tbls
elif type(tbls) is projection:
self.joins.append(alias(tbls.finalize()))
    def produce(self, node):
        if type(node) is list:
            for d in node:
                # pass the join node itself; append() silently ignores plain strings
                self.append(join(self, d))
elif type(node) is dict:
alias = ''
if 'value' in node:
table_name = node['value']
tbl = None
if 'name' in node:
alias = node['name']
if type(table_name) is dict:
if 'select' in table_name:
# TODO: subquery, create and register TableInfo in projection
tbl = projection(self, table_name).finalize()
else:
tbl = self.context.tables_byname[table_name]
if 'name' in node:
tbl.add_alias(node['name'])
self.append(tbl, alias)
else:
                keys = list(node.keys()) # dict_keys is not subscriptable
                if keys[0].lower().endswith('join'):
j = join(self, node[keys[0]])
tablename = f' {keys[0]} {j}'
if keys[1].lower() == 'on':
tablename += f' on {expr(self, node[keys[1]])}'
self.joins.append(tablename)
self.tables += j.tables
self.tables_dir = {**self.tables_dir, **j.tables_dir}
elif type(node) is str:
self.append(self.context.tables_byname[node])
def get_cols(self, colExpr: str) -> ColRef:
for t in self.tables:
if colExpr in t.columns_byname:
return t.columns_byname[colExpr]
def parse_col_names(self, colExpr:str) -> ColRef:
parsedColExpr = colExpr.split('.')
if len(parsedColExpr) <= 1:
return self.get_cols(colExpr)
else:
datasource = self.tables_dir[parsedColExpr[0]]
if datasource is None:
                raise ValueError(f'Table name/alias not defined: {parsedColExpr[0]}')
else:
return datasource.parse_col_names(parsedColExpr[1])
def consume(self, _):
self.sql = ', '.join(self.joins)
return super().consume(_)
def __str__(self):
return ', '.join(self.joins)
def __repr__(self):
return self.__str__()
class filter(ast_node):
name = 'where'
def produce(self, node):
self.add(expr(self, node).sql)
class create_table(ast_node):
name = 'create_table'
first_order = name
def init(self, node):
self.sql = 'CREATE TABLE '
def produce(self, node):
ct = node[self.name]
tbl = self.context.add_table(ct['name'], ct['columns'])
self.sql = f'CREATE TABLE {tbl.table_name}('
columns = []
for c in tbl.columns:
columns.append(f'{c.name} {c.type.upper()}')
self.sql += ', '.join(columns)
self.sql += ')'
class insert(ast_node):
name = 'insert'
first_order = name
def produce(self, node):
values = node['query']['select']
tbl = node['insert']
self.sql = f'INSERT INTO {tbl} VALUES('
# if len(values) != table.n_cols:
# raise ValueError("Column Mismatch")
list_values = []
for i, s in enumerate(values):
if 'value' in s:
list_values.append(f"{s['value']}")
else:
# subquery, dispatch to select astnode
pass
self.sql += ', '.join(list_values) + ')'
class load(ast_node):
name="load"
first_order = name
def produce(self, node):
node = node['load']
s1 = 'LOAD DATA INFILE '
s2 = 'INTO TABLE '
s3 = 'FIELDS TERMINATED BY '
self.sql = f'{s1} \"{node["file"]["literal"]}\" {s2} {node["table"]}'
if 'term' in node:
self.sql += f' {s3} \"{node["term"]["literal"]}\"'
class outfile(ast_node):
name="_outfile"
def produce(self, node):
filename = node['loc']['literal'] if 'loc' in node else node['literal']
self.sql = f'INTO OUTFILE "{filename}"'
if 'term' in node:
self.sql += f' FIELDS TERMINATED BY \"{node["term"]["literal"]}\"'
def include(objs):
import inspect
for _, cls in inspect.getmembers(objs):
if inspect.isclass(cls) and issubclass(cls, ast_node) and type(cls.first_order) is str:
ast_node.types[cls.first_order] = cls
import sys
include(sys.modules[__name__])

@ -0,0 +1,128 @@
from reconstruct.ast import ast_node
from reconstruct.storage import ColRef, TableInfo
class expr(ast_node):
name='expr'
builtin_func_maps = {
'max': 'MAX',
'min': 'MIN',
'avg': 'AVG',
'sum': 'SUM',
'count' : 'COUNT',
'mins': ['mins', 'minw'],
'maxs': ['maxs', 'maxw'],
'avgs': ['avgs', 'avgw'],
'sums': ['sums', 'sumw'],
}
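    # Scalar aggregates map to a single SQL name; the moving aggregates map to a
    # [running, windowed] pair and the call's arity picks which one is emitted.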
binary_ops = {
'sub':'-',
'add':'+',
'mul':'*',
'div':'/',
'mod':'%',
'and':' AND ',
'or':' OR ',
'xor' : ' XOR ',
        'gt':'>',
        'lt':'<',
        'le':'<=',
        'ge':'>='
}
compound_ops = {
}
unary_ops = {
'neg' : '-',
'not' : ' NOT '
}
    compound_generating_ops = ['avgs', 'mins', 'maxs', 'sums'] + \
        list(binary_ops.keys()) + list(compound_ops.keys()) + list(unary_ops.keys())
def __init__(self, parent, node):
self.raw_col = None
self.inside_agg = False
if(type(parent) is expr):
self.inside_agg = parent.inside_agg
ast_node.__init__(self, parent, node, None)
def init(self, _):
from engine.projection import projection
parent = self.parent
self.isvector = parent.isvector if type(parent) is expr else False
self.is_compound = parent.is_compound if type(parent) is expr else False
if type(parent) in [projection, expr]:
self.datasource = parent.datasource
else:
self.datasource = self.context.datasource
self.udf_map = parent.context.udf_map
self.func_maps = {**self.udf_map, **self.builtin_func_maps}
def produce(self, node):
if type(node) is dict:
for key, val in node.items():
if key in self.func_maps:
# TODO: distinguish between UDF agg functions and other UDF functions.
self.inside_agg = True
if type(val) is list and len(val) > 1:
cfunc = self.func_maps[key]
cfunc = cfunc[len(val) - 1] if type(cfunc) is list else cfunc
self.sql += f"{cfunc}("
for i, p in enumerate(val):
self.sql += expr(self, p).sql + (',' if i < len(val) - 1 else '')
else:
funcname = self.func_maps[key]
funcname = funcname[0] if type(funcname) is list else funcname
self.sql += f"{funcname}("
self.sql += expr(self, val).sql
self.sql += ')'
self.inside_agg = False
elif key in self.binary_ops:
l = expr(self, val[0]).sql
r = expr(self, val[1]).sql
self.sql += f'({l}{self.binary_ops[key]}{r})'
elif key in self.compound_ops:
x = []
if type(val) is list:
for v in val:
x.append(expr(self, v).sql)
self.sql = self.compound_ops[key][1](x)
elif key in self.unary_ops:
self.sql += f'{self.unary_ops[key]}({expr(self, val).sql})'
else:
print(f'Undefined expr: {key}{val}')
                if key in self.compound_generating_ops and not self.is_compound:
self.is_compound = True
p = self.parent
while type(p) is expr and not p.is_compound:
p.is_compound = True
p = p.parent
elif type(node) is str:
p = self.parent
while type(p) is expr and not p.isvector:
p.isvector = True
p = p.parent
self.raw_col = self.datasource.parse_col_names(node)
self.raw_col = self.raw_col if type(self.raw_col) is ColRef else None
if self.raw_col is not None:
self.sql = self.raw_col.name
else:
self.sql = node
elif type(node) is bool:
self.sql = '1' if node else '0'
else:
self.sql = f'{node}'
def __str__(self):
return self.sql
def __repr__(self):
return self.__str__()

@ -0,0 +1,89 @@
class ColRef:
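    # A column reference: type, parser object, owning table, display name and
    # position. __getitem__/__setitem__ expose the fields like a small record.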
def __init__(self, _ty, cobj, table:'TableInfo', name, id, compound = False):
self.type = _ty
self.cobj = cobj
self.table = table
self.name = name
self.alias = set()
self.id = id # position in table
        self.compound = compound # compound field (list as a field)
        # e.g. order by, group by, filter by expressions
        self.__arr__ = [_ty, cobj, table, name, id] # a list, so __setitem__ can mutate it
def __getitem__(self, key):
if type(key) is str:
return getattr(self, key)
else:
return self.__arr__[key]
def __setitem__(self, key, value):
self.__arr__[key] = value
class TableInfo:
def __init__(self, table_name, cols, cxt:'Context'):
# statics
self.table_name = table_name
self.alias = set([table_name])
self.columns_byname = dict() # column_name, type
self.columns = []
self.cxt = cxt
# keep track of temp vars
self.rec = None
self.add_cols(cols)
# runtime
self.order = [] # assumptions
cxt.tables_byname[self.table_name] = self # construct reverse map
def add_cols(self, cols, new = True):
for i, c in enumerate(cols):
self.add_col(c, new, i)
def add_col(self, c, new = True, i = 0):
_ty = c['type']
if new:
_ty = _ty if type(c) is ColRef else list(_ty.keys())[0]
col_object = ColRef(_ty, c, self, c['name'], len(self.columns))
else:
col_object = c
c.table = self
self.columns_byname[c['name']] = col_object
self.columns.append(col_object)
def add_alias(self, alias):
if alias in self.cxt.tables_byname.keys():
print("Error: table alias already exists")
return
self.cxt.tables_byname[alias] = self
self.alias.add(alias)
def parse_col_names(self, colExpr) -> ColRef:
parsedColExpr = colExpr.split('.')
if len(parsedColExpr) <= 1:
return self.columns_byname[colExpr]
else:
datasource = self.cxt.tables_byname[parsedColExpr[0]]
if datasource is None:
                raise ValueError(f'Table name/alias not defined: {parsedColExpr[0]}')
else:
return datasource.parse_col_names(parsedColExpr[1])
class Context:
def __init__(self):
self.sql = ''
self.tables_byname = dict()
self.col_byname = dict()
self.tables = []
self.cols = []
self.datasource = None
self.udf_map = dict()
def emit(self, sql:str):
self.sql += sql + ' '
def add_table(self, table_name, cols):
tbl = TableInfo(table_name, cols, self)
self.tables.append(tbl)
return tbl

@ -0,0 +1,5 @@
mo-future
mo-dots==8.20.21357
mo-parsing
mo-imports
readline; sys_platform != 'win32'

Binary file not shown.

@ -0,0 +1,6 @@
debug:
g++ -g3 -O0 server/server.cpp server/table.cpp -o a.out -Wall -Wextra -Wpedantic -lpthread
test:
g++ --std=c++1z -g3 -O0 server.cpp table.cpp -o a.out -Wall -Wextra -Wpedantic -lpthread

@ -0,0 +1,158 @@
#pragma once
#include "types.h"
#include <utility>
#include <limits>
#include <deque>
template <class T, template<typename ...> class VT>
size_t count(const VT<T>& v) {
return v.size;
}
template <class T>
constexpr static inline size_t count(const T&) { return 1; }
// TODO: Specializations for dt/str/none
template<class T, template<typename ...> class VT>
types::GetLongType<T> sum(const VT<T>& v) {
types::GetLongType<T> ret = 0;
for (const auto& _v : v)
ret += _v;
return ret;
}
template<class T, template<typename ...> class VT>
types::GetFPType<T> avg(const VT<T>& v) {
return static_cast<types::GetFPType<T>>(
sum<T>(v) / static_cast<long double>(v.size));
}
template <class T, template<typename ...> class VT>
T max(const VT<T>& v) {
	// lowest(), not min(): for floating-point T, min() is the smallest positive value
	T max_v = std::numeric_limits<T>::lowest();
	for (const auto& _v : v)
		max_v = max_v > _v ? max_v : _v;
	return max_v;
}
template <class T, template<typename ...> class VT>
T min(const VT<T>& v) {
T min_v = std::numeric_limits<T>::max();
for (const auto& _v : v)
min_v = min_v < _v ? min_v : _v;
return min_v;
}
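// The *s functions below are running (cumulative) aggregates over a whole
// column; the *w functions aggregate over a sliding window of width w.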
template<class T, template<typename ...> class VT>
decayed_t<VT,T> mins(const VT<T>& arr) {
	const uint32_t& len = arr.size;
	decayed_t<VT,T> ret(len);
	T min = std::numeric_limits<T>::max();
	for (uint32_t i = 0; i < len; ++i) {
		if (arr[i] < min)
			min = arr[i];
		ret[i] = min;
	}
	return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT,T> maxs(const VT<T>& arr) {
	const uint32_t& len = arr.size;
	decayed_t<VT,T> ret(len);
	T max = std::numeric_limits<T>::lowest(); // lowest(), not min(), to handle negatives
	for (uint32_t i = 0; i < len; ++i) {
		if (arr[i] > max)
			max = arr[i];
		ret[i] = max;
	}
	return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT,T> minw(uint32_t w, const VT<T>& arr) {
	const uint32_t& len = arr.size;
	decayed_t<VT,T> ret(len);
	// monotonic deque of (value, index); the front is the current window minimum
	std::deque<std::pair<T, uint32_t>> cache;
	for (uint32_t i = 0; i < len; ++i) {
		if (!cache.empty() && i >= w && cache.front().second == i - w) cache.pop_front();
		while (!cache.empty() && cache.back().first > arr[i]) cache.pop_back();
		cache.push_back({ arr[i], i });
		ret[i] = cache.front().first;
	}
	return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT,T> maxw(uint32_t w, const VT<T>& arr) {
	const uint32_t& len = arr.size;
	decayed_t<VT, T> ret(len);
	// monotonic deque of (value, index); the front is the current window maximum
	std::deque<std::pair<T, uint32_t>> cache;
	for (uint32_t i = 0; i < len; ++i) {
		if (!cache.empty() && i >= w && cache.front().second == i - w) cache.pop_front();
		while (!cache.empty() && cache.back().first < arr[i]) cache.pop_back(); // '<' keeps a decreasing deque
		cache.push_back({ arr[i], i });
		ret[i] = cache.front().first; // write into the result, not the const input
	}
	return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT, types::GetLongType<T>> sums(const VT<T>& arr) {
const uint32_t& len = arr.size;
decayed_t<VT, types::GetLongType<T>> ret(len);
uint32_t i = 0;
if(len) ret[i++] = arr[0];
for (; i < len; ++i)
ret[i] = ret[i-1] + arr[i];
return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT, types::GetFPType<T>> avgs(const VT<T>& arr) {
const uint32_t& len = arr.size;
typedef types::GetFPType<T> FPType;
decayed_t<VT, FPType> ret(len);
uint32_t i = 0;
types::GetLongType<T> s{};
if(len) s = ret[i++] = arr[0];
for (; i < len; ++i)
ret[i] = (s+=arr[i])/(FPType)(i+1);
return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT, types::GetLongType<T>> sumw(uint32_t w, const VT<T>& arr) {
const uint32_t& len = arr.size;
decayed_t<VT, types::GetLongType<T>> ret(len);
uint32_t i = 0;
w = w > len ? len : w;
if(len) ret[i++] = arr[0];
for (; i < w; ++i)
ret[i] = ret[i-1] + arr[i];
for (; i < len; ++i)
ret[i] = ret[i-1] + arr[i] - arr[i-w];
return ret;
}
template<class T, template<typename ...> class VT>
decayed_t<VT, types::GetFPType<T>> avgw(uint32_t w, const VT<T>& arr) {
typedef types::GetFPType<T> FPType;
const uint32_t& len = arr.size;
decayed_t<VT, FPType> ret(len);
uint32_t i = 0;
types::GetLongType<T> s{};
w = w > len ? len : w;
if(len) s = ret[i++] = arr[0];
for (; i < w; ++i)
ret[i] = (s += arr[i])/(FPType)(i+1);
for (; i < len; ++i)
ret[i] = ret[i-1] + (arr[i] - arr[i-w])/(FPType)w;
return ret;
}
template <class T> constexpr inline T max(const T& v) { return v; }
template <class T> constexpr inline T min(const T& v) { return v; }
template <class T> constexpr inline T avg(const T& v) { return v; }
template <class T> constexpr inline T sum(const T& v) { return v; }
template <class T> constexpr inline T maxw(uint32_t, const T& v) { return v; }
template <class T> constexpr inline T minw(uint32_t, const T& v) { return v; }
template <class T> constexpr inline T avgw(uint32_t, const T& v) { return v; }
template <class T> constexpr inline T sumw(uint32_t, const T& v) { return v; }
template <class T> constexpr inline T maxs(const T& v) { return v; }
template <class T> constexpr inline T mins(const T& v) { return v; }
template <class T> constexpr inline T avgs(const T& v) { return v; }
template <class T> constexpr inline T sums(const T& v) { return v; }
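For orientation, a minimal sketch exercising these aggregates; values and names are illustrative, and it assumes vector_type.hpp and <cstdio> are on hand:

#include <cstdio>
#include "vector_type.hpp"
#include "aggregations.h"
void agg_sketch() {
    vector_type<int> v{ 3, 1, 4, 1, 5 };
    auto rs = sums(v);     // running sum: (3 4 8 9 14)
    auto mw = minw(2, v);  // sliding minimum, window w = 2: (3 1 1 1 1)
    printf("%lld %f\n", (long long)sum(v), (double)avg(v));  // 14 2.800000
    rs.out(5); mw.out(5);
}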

@ -0,0 +1,6 @@
// Hint files help the Visual Studio IDE interpret Visual C++ identifiers
// such as names of functions and macros.
// For more information see https://go.microsoft.com/fwlink/?linkid=865984
#define Ops(o) template<typename T> vector_type<typename types::Coercion<_Ty, T>::type> operator##o(const vector_type<T>& r) { [[likely]] if (r.size == size) { return add(r); } else if (r.size == 1 || size == 1) { const bool lr = size == 1; const uint32_t& _size = lr ? r.size : size; const auto& _container = lr ? r.container : container; const auto& scalar = *(lr ? container : r.container); vector_type<typename types::Coercion<_Ty, T>::type> ret(_size); for (int i = 0; i < _size; ++i) ret[i] = _container[i] o scalar; return ret; } }
#define Op(o, x) template<typename T> vector_type<typename types::Coercion<_Ty, T>::type> inline x(const vector_type<T>& r) { vector_type<typename types::Coercion<_Ty, T>::type> ret(size); for (int i = 0; i < size; ++i) ret[i] = container[i] o r[i]; return ret; }
#define _Make_Ops(M) M(+, add) M(-, minus) M(*, multi) M(/, div) M(%, mod) M(&, and) M(|, or) M(^, xor)

@ -0,0 +1,54 @@
#pragma once
#include <vector>
#include <utility>
#include <thread>
#include <chrono>
class GC {
template<class T>
using vector = std::vector<T>;
template<class ...T>
using tuple = std::tuple<T...>;
size_t current_size = 0, max_size = 0, interval = 0, forced_clean = 0;
bool running = false, alive = false;
// ptr, dealloc, ref, sz
vector<tuple<void*, void (*)(void*), uint32_t, uint32_t>> q;
std::thread handle;
void gc()
{
}
public: // registration and daemon controls must be reachable by callers
template <class T>
void reg(T* v, uint32_t ref, uint32_t sz,
void(*f)(void*) = [](void* v) {delete[] ((T*)v); }) {
current_size += sz;
if (current_size > max_size)
gc();
q.push_back({ v, f, ref, sz });
}
void daemon() {
using namespace std::chrono;
while (alive) {
if (running) {
gc();
std::this_thread::sleep_for(microseconds(interval));
}
else {
std::this_thread::sleep_for(10ms);
}
}
}
void start_daemon() {
alive = true; // set before the thread starts polling
handle = std::thread(&GC::daemon, this);
}
void terminate_daemon() {
running = false;
alive = false;
using namespace std::chrono;
if (handle.joinable()) {
std::this_thread::sleep_for(microseconds(1000 + std::max(static_cast<size_t>(10000), interval)));
handle.join();
}
}
};
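With the registration interface made reachable (the public: section above), the intended flow is roughly the following sketch; note gc() is still a stub, so nothing is actually freed yet:

void gc_sketch() {
    GC gc_instance;
    auto* buf = new int[1024];
    // track (ptr, deleter, refcount, size); the default deleter is delete[] (T*)v
    gc_instance.reg(buf, /*ref=*/1, /*sz=*/1024 * sizeof(int));
    gc_instance.start_daemon();       // background sweep loop
    gc_instance.terminate_daemon();   // stop the sweeper before shutdown
    // (with gc() unimplemented, buf is never reclaimed here)
}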

@ -0,0 +1,20 @@
#pragma once
#include <functional>
#include <tuple>
template <class ...Types>
struct hasher {
template <size_t i = 0> typename std::enable_if< i == sizeof...(Types),
size_t>::type hashi(const std::tuple<Types...>& record) const {
return 0;
}
template <size_t i = 0> typename std::enable_if< i < sizeof ...(Types),
size_t>::type hashi(const std::tuple<Types...>& record) const {
using current_type = typename std::decay<typename std::tuple_element<i, std::tuple<Types...>>::type>::type;
return std::hash<current_type>()(std::get<i>(record)) ^ hashi<i+1>(record);
}
size_t operator()(const std::tuple<Types...>& record) const {
return hashi(record);
}
};
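A sketch of hasher<> in its intended role, as the Hash functor of an unordered container keyed on record tuples; the container and key types here are illustrative:

#include <unordered_map>
#include <string>
#include "hasher.h"
void groupby_sketch() {
    std::unordered_map<std::tuple<int, std::string>, int,
                       hasher<int, std::string>> groups;
    groups[{2022, "AAPL"}] += 1;  // hashi<> XOR-folds std::hash of each field
}

XOR-folding is cheap but symmetric: keys that permute equal-typed fields hash identically, so a multiply-rotate combiner would distribute better.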

@ -0,0 +1,15 @@
#pragma once
#include "types.h"
#include <cstdio>
#include <cstring> // strlen
#include <string>
template <class ...Types>
std::string generate_printf_string(const char* sep = " ", const char* end = "\n") {
std::string str;
((str += types::printf_str[types::Types<value_type_r<Types>>::getType()], str += sep), ...);
const size_t l_sep = strlen(sep);
if (str.size() >= l_sep) // avoid unsigned underflow on an empty pack
str.resize(str.size() - l_sep);
str += end;
return str;
}
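For reference, a sketch of what the helper produces; value_type_r unwraps vector types first, so a ColRef<int> column also maps to "%d":

// assumes io.h (and, transitively, types.h)
auto fmt = generate_printf_string<int, float, const char*>(", ", "\n");
// fmt == "%d, %f, %s\n": one conversion per column, the trailing separator trimmed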

@ -0,0 +1,48 @@
#ifndef _AQUERY_H
#define _AQUERY_H
#include "table.h"
#include <unordered_map>
enum Log_level {
LOG_INFO,
LOG_ERROR,
LOG_SILENT
};
struct Config{
int running, new_query, server_mode, n_buffers;
int buffer_sizes[];
};
struct Context{
typedef int (*printf_type) (const char *format, ...);
std::unordered_map<const char*, void*> tables;
std::unordered_map<const char*, uColRef *> cols;
Config* cfg;
int n_buffers, *sz_bufs;
void **buffers;
Log_level log_level = LOG_SILENT;
printf_type print = printf;
template <class ...Types>
void log(Types... args) {
if (log_level == LOG_INFO)
print(args...);
}
template <class ...Types>
void err(Types... args) {
if (log_level <= LOG_ERROR)
print(args...);
}
};
#ifdef _WIN32
#define __DLLEXPORT__ __declspec(dllexport) __stdcall
#else
#define __DLLEXPORT__
#endif
#endif
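A sketch of the log-level gating above: log() emits only at LOG_INFO, err() at LOG_INFO and LOG_ERROR, and LOG_SILENT (the default) suppresses both. Values here are illustrative:

void logging_sketch() {
    Context cxt;
    cxt.log_level = LOG_ERROR;
    cxt.log("loaded %d buffers\n", 4);       // dropped: level is above LOG_INFO
    cxt.err("open failed: %s\n", "dll.so");  // printed through cxt.print (printf)
}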

@ -0,0 +1,19 @@
#pragma once
#include "vector_type.hpp"
#include <algorithm>
#include <stdint.h>
template <class Comparator, typename T = uint32_t>
class priority_vector : public vector_type<T> {
const Comparator comp;
public:
priority_vector(Comparator comp = std::less<T>{}) :
vector_type<T>(0), comp(comp) {} // base before members; avoids -Wreorder
void emplace_back(T val) {
vector_type<T>::emplace_back(val);
std::push_heap(this->container, this->container + this->size, comp); // dependent-base members need this->
}
void pop_back() {
std::pop_heap(this->container, this->container + this->size, comp);
--this->size;
}
};
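A short usage sketch; with std::less (the default comparator) push_heap maintains a max-heap, so the largest element sits at index 0:

void heap_sketch() {
    priority_vector<std::less<uint32_t>> pq;
    pq.emplace_back(3);
    pq.emplace_back(7);
    pq.emplace_back(5);
    // pq[0] == 7; pop_back() re-heapifies and shrinks the heap by one
    pq.pop_back();   // pq[0] becomes 5
}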

@ -0,0 +1,135 @@
#include "../csv.h"
#include <iostream>
#include <string>
#include <thread> // std::this_thread is used below
#include <chrono>
#include "libaquery.h"
#ifdef _WIN32
#include "winhelper.h"
#else
#include <dlfcn.h>
#include <fcntl.h>
#include <sys/mman.h>
struct SharedMemory
{
int hFileMap;
void* pData;
SharedMemory(const char* fname) {
hFileMap = open(fname, O_RDWR, 0);
if (hFileMap != -1)
pData = mmap(NULL, 8, PROT_READ | PROT_WRITE, MAP_SHARED, hFileMap, 0);
else
pData = 0;
}
void FreeMemoryMap() {
}
};
#endif
struct thread_context{
}v;
void daemon(thread_context* c) {
}
#include "aggregations.h"
typedef int (*code_snippet)(void*);
int test_main();
int dll_main(int argc, char** argv, Context* cxt){
Config *cfg = reinterpret_cast<Config *>(argv[0]);
auto buf_szs = cfg->buffer_sizes;
void** buffers = (void**)malloc(sizeof(void*) * cfg->n_buffers);
for (int i = 0; i < cfg->n_buffers; i++)
buffers[i] = static_cast<void *>(argv[i + 1]);
cxt->buffers = buffers;
cxt->cfg = cfg;
cxt->n_buffers = cfg->n_buffers;
cxt->sz_bufs = buf_szs;
while(cfg->running){
if (cfg->new_query) {
void* handle = dlopen("./dll.so", RTLD_LAZY);
code_snippet c = reinterpret_cast<code_snippet>(dlsym(handle, "dllmain"));
c(cxt);
dlclose(handle);
cfg->new_query = 0;
}
std::this_thread::sleep_for(std::chrono::milliseconds(100));
}
return 0;
}
extern "C" int __DLLEXPORT__ main(int argc, char** argv) {
Context* cxt = new Context();
cxt->log("%d %s\n", argc, argv[1]);
const char* shmname;
if (argc < 0)
return dll_main(argc, argv, cxt);
else if (argc <= 1)
return test_main();
else
shmname = argv[1];
SharedMemory shm = SharedMemory(shmname);
if (!shm.pData)
return 1;
bool &running = static_cast<bool*>(shm.pData)[0],
&ready = static_cast<bool*>(shm.pData)[1];
using namespace std::chrono_literals;
cxt->log("running: %s\n", running? "true":"false");
cxt->log("ready: %s\n", ready? "true":"false");
while (running) {
std::this_thread::sleep_for(1ms);
if(ready){
cxt->log("running: %s\n", running? "true":"false");
cxt->log("ready: %s\n", ready? "true":"false");
void* handle = dlopen("./dll.so", RTLD_LAZY);
cxt->log("handle: %lx\n", handle);
if (handle) {
cxt->log("inner\n");
code_snippet c = reinterpret_cast<code_snippet>(dlsym(handle, "dllmain"));
cxt->log("routine: %lx\n", c);
if (c) {
cxt->log("inner\n");
cxt->err("return: %d\n", c(cxt));
}
}
ready = false;
}
}
shm.FreeMemoryMap();
return 0;
}
#include "utils.h"
int test_main()
{
//vector_type<int> t;
//t = 1;
//t.emplace_back(2);
//print(t);
//return 0;
Context* cxt = new Context();
cxt->log_level = LOG_INFO;
puts(cpp_17 ?"true":"false");
void* handle = dlopen("./dll.so", RTLD_LAZY);
cxt->log("handle: %llx\n", handle);
if (handle) {
cxt->log("inner\n");
code_snippet c = reinterpret_cast<code_snippet>(dlsym(handle, "dllmain"));
cxt->log("routine: %llx\n", c);
if (c) {
cxt->log("inner\n");
cxt->log("return: %d\n", c(cxt));
}
dlclose(handle);
}
//static_assert(std::is_same_v<decltype(fill_integer_array<5, 1>()), std::integer_sequence<bool, 1,1,1,1,1>>, "");
return 0;
}
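The query snippets loaded above follow a one-function contract: the server dlopen()s ./dll.so and dlsym()s "dllmain", passing the engine Context. A minimal module satisfying that contract could look like this sketch (the build line is illustrative):

// build: g++ --std=c++1z -shared -fPIC dll.cpp -o dll.so
#include "libaquery.h"
extern "C" int dllmain(Context* cxt) {
    cxt->log("query snippet running\n");
    return 0;   // surfaced by the server via err("return: %d\n", ...)
}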

@ -0,0 +1,62 @@

Microsoft Visual Studio Solution File, Format Version 12.00
# Visual Studio Version 17
VisualStudioVersion = 17.2.32210.308
MinimumVisualStudioVersion = 10.0.40219.1
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "server", "server.vcxproj", "{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}"
ProjectSection(ProjectDependencies) = postProject
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81} = {8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}
EndProjectSection
EndProject
Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "msc-plugin", "..\msc-plugin\msc-plugin.vcxproj", "{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}"
EndProject
Project("{888888A0-9F3D-457C-B088-3A5042F75D52}") = "msvs-py", "..\msvs-py\msvs-py.pyproj", "{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Debug|x64 = Debug|x64
Debug|x86 = Debug|x86
Release|Any CPU = Release|Any CPU
Release|x64 = Release|x64
Release|x86 = Release|x86
EndGlobalSection
GlobalSection(ProjectConfigurationPlatforms) = postSolution
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|Any CPU.ActiveCfg = Debug|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|Any CPU.Build.0 = Debug|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|x64.ActiveCfg = Debug|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|x64.Build.0 = Debug|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|x86.ActiveCfg = Debug|Win32
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Debug|x86.Build.0 = Debug|Win32
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|Any CPU.ActiveCfg = Release|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|Any CPU.Build.0 = Release|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|x64.ActiveCfg = Release|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|x64.Build.0 = Release|x64
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|x86.ActiveCfg = Release|Win32
{031352C2-AFBB-45AA-9518-DBC1F9EF2AF3}.Release|x86.Build.0 = Release|Win32
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|Any CPU.ActiveCfg = Debug|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|Any CPU.Build.0 = Debug|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|x64.ActiveCfg = Debug|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|x64.Build.0 = Debug|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|x86.ActiveCfg = Debug|Win32
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Debug|x86.Build.0 = Debug|Win32
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|Any CPU.ActiveCfg = Release|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|Any CPU.Build.0 = Release|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|x64.ActiveCfg = Release|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|x64.Build.0 = Release|x64
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|x86.ActiveCfg = Release|Win32
{8081FDAA-4D13-4B7A-ADB2-8224AF7F1C81}.Release|x86.Build.0 = Release|Win32
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Debug|x64.ActiveCfg = Debug|Any CPU
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Debug|x86.ActiveCfg = Debug|Any CPU
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Release|Any CPU.ActiveCfg = Release|Any CPU
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Release|x64.ActiveCfg = Release|Any CPU
{CCC243F5-663E-45B7-A6DE-B2468C58B3A7}.Release|x86.ActiveCfg = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {572EA821-8162-4161-9AC2-464C79F08B47}
EndGlobalSection
EndGlobal

@ -0,0 +1,188 @@
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<ItemGroup Label="ProjectConfigurations">
<ProjectConfiguration Include="Debug|Win32">
<Configuration>Debug</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|Win32">
<Configuration>Release</Configuration>
<Platform>Win32</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Debug|x64">
<Configuration>Debug</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
<ProjectConfiguration Include="Release|x64">
<Configuration>Release</Configuration>
<Platform>x64</Platform>
</ProjectConfiguration>
</ItemGroup>
<PropertyGroup Label="Globals">
<VCProjectVersion>16.0</VCProjectVersion>
<Keyword>Win32Proj</Keyword>
<ProjectGuid>{031352c2-afbb-45aa-9518-dbc1f9ef2af3}</ProjectGuid>
<RootNamespace>server</RootNamespace>
<WindowsTargetPlatformVersion>10.0.19041.0</WindowsTargetPlatformVersion>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.Default.props" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>false</EnableASAN>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>true</EnableASAN>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>true</UseDebugLibraries>
<PlatformToolset>v143</PlatformToolset>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>false</EnableASAN>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">
<ConfigurationType>Application</ConfigurationType>
<UseDebugLibraries>false</UseDebugLibraries>
<PlatformToolset>v141</PlatformToolset>
<WholeProgramOptimization>true</WholeProgramOptimization>
<CharacterSet>Unicode</CharacterSet>
<EnableASAN>true</EnableASAN>
</PropertyGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.props" />
<ImportGroup Label="ExtensionSettings">
</ImportGroup>
<ImportGroup Label="Shared">
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<Import Project="$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props" Condition="exists('$(UserRootDir)\Microsoft.Cpp.$(Platform).user.props')" Label="LocalAppDataPlatform" />
</ImportGroup>
<PropertyGroup Label="UserMacros" />
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<LinkIncremental>true</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<LinkIncremental>false</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<LinkIncremental>true</LinkIncremental>
</PropertyGroup>
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<LinkIncremental>false</LinkIncremental>
</PropertyGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;WIN32;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<EnableParallelCodeGeneration>true</EnableParallelCodeGeneration>
<LanguageStandard>stdcpplatest</LanguageStandard>
<LanguageStandard_C>stdc17</LanguageStandard_C>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|Win32'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;WIN32;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<EnableParallelCodeGeneration>true</EnableParallelCodeGeneration>
<LanguageStandard>stdcpplatest</LanguageStandard>
<LanguageStandard_C>stdc17</LanguageStandard_C>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<GenerateDebugInformation>true</GenerateDebugInformation>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<EnableParallelCodeGeneration>true</EnableParallelCodeGeneration>
<LanguageStandard>stdcpplatest</LanguageStandard>
<LanguageStandard_C>stdc17</LanguageStandard_C>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<GenerateDebugInformation>true</GenerateDebugInformation>
</Link>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'">
<ClCompile>
<WarningLevel>Level3</WarningLevel>
<FunctionLevelLinking>true</FunctionLevelLinking>
<IntrinsicFunctions>true</IntrinsicFunctions>
<SDLCheck>true</SDLCheck>
<PreprocessorDefinitions>_CRT_SECURE_NO_WARNINGS;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
<ConformanceMode>true</ConformanceMode>
<MultiProcessorCompilation>true</MultiProcessorCompilation>
<EnableParallelCodeGeneration>true</EnableParallelCodeGeneration>
<LanguageStandard>stdcpplatest</LanguageStandard>
<LanguageStandard_C>stdc17</LanguageStandard_C>
</ClCompile>
<Link>
<SubSystem>Console</SubSystem>
<EnableCOMDATFolding>true</EnableCOMDATFolding>
<OptimizeReferences>true</OptimizeReferences>
<GenerateDebugInformation>true</GenerateDebugInformation>
</Link>
</ItemDefinitionGroup>
<ItemGroup>
<Media Include="..\out.cpp" />
<ClCompile Include="table.cpp" />
<ClCompile Include="server.cpp" />
<ClCompile Include="types.cpp" />
<ClCompile Include="utils.cpp" />
<ClCompile Include="vector_type.cpp" />
<ClCompile Include="winhelper.cpp" />
</ItemGroup>
<ItemGroup>
<None Include="cpp.hint" />
</ItemGroup>
<ItemGroup>
<ClInclude Include="..\csv.h" />
<ClInclude Include="aggregations.h" />
<ClInclude Include="gc.hpp" />
<ClInclude Include="hasher.h" />
<ClInclude Include="io.h" />
<ClInclude Include="libaquery.h" />
<ClInclude Include="priority_vector.hpp" />
<ClInclude Include="table.h" />
<ClInclude Include="types.h" />
<ClInclude Include="utils.h" />
<ClInclude Include="vector_type.hpp" />
<ClInclude Include="winhelper.h" />
</ItemGroup>
<Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
</Project>

@ -0,0 +1 @@
#include "table.h"

@ -0,0 +1,481 @@
// TODO: Replace `cout, printf` with sprintf&fputs and custom buffers
#ifndef _TABLE_H
#define _TABLE_H
#include "types.h"
#include "vector_type.hpp"
#include <iostream>
#include <string>
#include "io.h"
template <typename T>
class vector_type;
template <>
class vector_type<void>;
#ifdef _MSC_VER
namespace types {
enum Type_t;
template <typename T>
struct Types;
template <class T1, class T2>
struct Coercion;
}
#endif
template<typename _Ty>
class ColView;
template<typename _Ty>
class ColRef : public vector_type<_Ty>
{
public:
typedef ColRef<_Ty> Decayed_t;
const char* name;
types::Type_t ty = types::ERROR;
ColRef() : vector_type<_Ty>(0), name("") {}
ColRef(const uint32_t& size, const char* name = "") : vector_type<_Ty>(size), name(name) {}
ColRef(const char* name) : name(name) {}
void init(const char* name = "") { ty = types::Types<_Ty>::getType(); this->size = this->capacity = 0; this->container = 0; this->name = name; }
ColRef(const char* name, types::Type_t ty) : name(name), ty(ty) {}
using vector_type<_Ty>::operator[];
using vector_type<_Ty>::operator=;
ColView<_Ty> operator [](const vector_type<uint32_t>&idxs) const {
return ColView<_Ty>(*this, idxs);
}
void out(uint32_t n = 4, const char* sep = " ") const {
n = n > this->size ? this->size : n;
std::cout << '(';
if (n > 0)
{
uint32_t i = 0;
for (; i < n - 1; ++i)
std::cout << this->operator[](i) << sep;
std::cout << this->operator[](i);
}
std::cout << ')';
}
template<typename T>
ColRef<T> scast();
};
template<typename _Ty>
class ColView {
public:
typedef ColRef<_Ty> Decayed_t;
const vector_type<uint32_t>& idxs;
const ColRef<_Ty>& orig;
const uint32_t& size;
ColView(const ColRef<_Ty>& orig, const vector_type<uint32_t>& idxs) : orig(orig), idxs(idxs), size(idxs.size) {}
ColView(const ColView<_Ty>& orig, const vector_type<uint32_t>& idxs) : orig(orig.orig), idxs(idxs), size(idxs.size) {
for (uint32_t i = 0; i < size; ++i)
idxs[i] = orig.idxs[idxs[i]];
}
_Ty& operator [](const uint32_t& i) const {
return orig[idxs[i]];
}
struct Iterator_t {
const uint32_t* val;
const ColRef<_Ty>& orig;
constexpr Iterator_t(const uint32_t* val, const ColRef<_Ty>& orig) noexcept : val(val), orig(orig) {}
_Ty& operator*() { return orig[*val]; }
bool operator != (const Iterator_t& rhs) { return rhs.val != val; }
Iterator_t& operator++ () {
++val;
return *this;
}
Iterator_t operator++ (int) {
Iterator_t tmp = *this;
++val;
return tmp;
}
};
Iterator_t begin() const {
return Iterator_t(idxs.begin(), orig);
}
Iterator_t end() const {
return Iterator_t(idxs.end(), orig);
}
void out(uint32_t n = 4, const char* sep = " ") const {
n = n > size ? size : n;
std::cout<<'(';
for (uint32_t i = 0; i < n; ++i)
std::cout << this->operator[](i)<< sep;
std::cout << ')';
}
operator ColRef<_Ty>() {
auto ret = ColRef<_Ty>(size);
for (uint32_t i = 0; i < size; ++i)
ret[i] = orig[idxs[i]];
return ret;
}
};
template <template <class...> class VT, class T>
std::ostream& operator<<(std::ostream& os, const VT<T>& v)
{
v.out();
return os;
}
template <class Type>
struct decayed_impl<ColView, Type> { typedef ColRef<Type> type; };
template<typename _Ty>
template<typename T>
inline ColRef<T> ColRef<_Ty>::scast()
{
this->ty = types::Types<T>::getType();
return *(ColRef<T> *)this;
}
using uColRef = ColRef<void>;
template<class ...Types> struct TableInfo;
template<class ...Types> struct TableView;
template <long long _Index, bool order = true, class... _Types>
constexpr inline auto& get(const TableInfo<_Types...>& table) noexcept {
if constexpr (order)
return *(ColRef<std::tuple_element_t<_Index, std::tuple<_Types...>>> *) & (table.colrefs[_Index]);
else
return *(ColRef<std::tuple_element_t<-1-_Index, std::tuple<_Types...>>> *) & (table.colrefs[-1-_Index]);
}
template <long long _Index, class... _Types>
constexpr inline ColRef<std::tuple_element_t<_Index, std::tuple<_Types...>>>& get(const TableView<_Types...>& table) noexcept {
return *(ColRef<std::tuple_element_t<_Index, std::tuple<_Types...>>> *) & (table.info.colrefs[_Index]);
}
template <class V>
struct is_vector_impl<ColRef<V>> : std::true_type {};
template <class V>
struct is_vector_impl<ColView<V>> : std::true_type {};
template <class V>
struct is_vector_impl<vector_type<V>> : std::true_type {};
template<class ...Types>
struct TableView;
template<class ...Types>
struct TableInfo {
const char* name;
ColRef<void>* colrefs;
uint32_t n_cols;
typedef std::tuple<Types...> tuple_type;
void print(const char* __restrict sep, const char* __restrict end) const;
template <class ...Types2>
struct lineage_t {
TableInfo<Types...>* this_table;
TableInfo<Types2...>* table;
vector_type<uint32_t> rid;
constexpr lineage_t(TableInfo<Types...>*this_table, TableInfo<Types2...> *table)
: this_table(this_table), table(table), rid(0) {}
constexpr lineage_t() : this_table(0), table(0), rid(0) {}
template <int col>
inline auto& get(uint32_t idx) {
return get<col>(*table)[rid[idx]];
}
void emplace_back(const uint32_t& v) {
rid.emplace_back(v);
}
};
template<class ...Types2>
auto bind(TableInfo<Types2...>* table2) {
return lineage_t(this, table2);
}
template <size_t j = 0>
typename std::enable_if<j == sizeof...(Types) - 1, void>::type print_impl(const uint32_t& i, const char* __restrict sep = " ") const;
template <size_t j = 0>
typename std::enable_if<j < sizeof...(Types) - 1, void>::type print_impl(const uint32_t& i, const char* __restrict sep = " ") const;
template <size_t ...Idxs>
struct GetTypes {
typedef typename std::tuple<typename std::tuple_element<Idxs, tuple_type>::type ...> type;
};
template <size_t ...Idxs>
using getRecordType = typename GetTypes<Idxs...>::type;
TableInfo(const char* name, uint32_t n_cols);
template <int prog = 0>
inline void materialize(const vector_type<uint32_t>& idxs, TableInfo<Types...>* tbl = nullptr) { // inplace materialize
if constexpr(prog == 0) tbl = (tbl == 0 ? this : tbl);
if constexpr (prog == sizeof...(Types)) return;
else {
auto& col = get<prog>(*this);
auto new_col = decays<decltype(col)>{idxs.size};
for(uint32_t i = 0; i < idxs.size; ++i)
new_col[i] = col[idxs[i]];
get<prog>(*tbl) = new_col;
materialize<prog + 1>(idxs, tbl);
}
}
inline TableInfo<Types...>* materialize_copy(const vector_type<uint32_t>& idxs) {
auto tbl = new TableInfo<Types...>(this->name, sizeof...(Types));
materialize<0>(idxs, tbl);
return tbl;
}
template<int ...cols>
inline vector_type<uint32_t>* order_by(vector_type<uint32_t>* ord = nullptr) {
if (!ord) {
ord = new vector_type<uint32_t>(colrefs[0].size);
for (uint32_t i = 0; i < colrefs[0].size; ++i)
(*ord)[i] = i;
}
std::sort(ord->begin(), ord->end(), [this](const uint32_t& lhs, const uint32_t& rhs) {
return
std::forward_as_tuple((cols >= 0 ? get<cols, (cols >= 0)>(*this)[lhs] : -get<cols, (cols >= 0)>(*this)[lhs]) ...)
<
std::forward_as_tuple((cols >= 0 ? get<cols, (cols >= 0)>(*this)[rhs] : -get<cols, (cols >= 0)>(*this)[rhs]) ...);
});
return ord;
}
template <int ...cols>
auto order_by_view () {
return TableView<Types...>(order_by<cols...>(), *this);
}
// Print 2 -- generate printf string first, supports flattening, supports sprintf/printf/fprintf
template <int col, int ...rem_cols, class Fn, class ...__Types>
inline void print2_impl(Fn func, const uint32_t& i, const __Types& ... args) const {
using this_type = typename std::tuple_element<col, tuple_type>::type;
const auto& this_value = get<col>(*this)[i];
const auto& next = [&](auto &v) {
if constexpr (sizeof...(rem_cols) == 0)
func(args..., v);
else
print2_impl<rem_cols...>(func, i, args ..., v);
};
if constexpr (is_vector_type<this_type>)
for (uint32_t j = 0; j < this_value.size; ++j)
next(this_value[j]);
else
next(this_value);
}
std::string get_header_string(const char* __restrict sep, const char* __restrict end) const{
std::string header_string = std::string();
for (int i = 0; i < sizeof...(Types); ++i)
header_string += std::string(this->colrefs[i].name) + sep;
const size_t l_sep = strlen(sep);
if (header_string.size() >= l_sep) // unsigned: the old >= 0 test was always true
header_string.resize(header_string.size() - l_sep);
header_string += end + std::string(header_string.size(), '=') + end;
return header_string;
}
template <int ...cols>
void print2(const char* __restrict sep = ",", const char* __restrict end = "\n",
const vector_type<uint32_t>* __restrict view = nullptr, FILE* __restrict fp = nullptr) const {
std::string printf_string =
generate_printf_string<typename std::tuple_element<cols, tuple_type>::type ...>(sep, end);
std::string header_string = std::string();
constexpr static int a_cols[] = { cols... };
for(int i = 0; i < sizeof...(cols); ++i)
header_string += std::string(this->colrefs[a_cols[i]].name) + sep;
const size_t l_sep = strlen(sep);
if (header_string.size() >= l_sep)
header_string.resize(header_string.size() - l_sep);
const auto& prt_loop = [&fp, &view, &printf_string, *this](const auto& f) {
if(view)
for (int i = 0; i < view->size; ++i)
print2_impl<cols...>(f, (*view)[i], printf_string.c_str());
else
for (int i = 0; i < colrefs[0].size; ++i)
print2_impl<cols...>(f, i, printf_string.c_str());
};
if (fp)
{
fprintf(fp, "%s%s", header_string.c_str(), end);
prt_loop([&fp](auto... args) { fprintf(fp, args...); });
}
else {
printf("%s%s", header_string.c_str(), end);
prt_loop(printf);
}
}
template <int ...vals> struct applier {
inline constexpr static void apply(const TableInfo<Types...>& t, const char* __restrict sep = ",", const char* __restrict end = "\n",
const vector_type<uint32_t>* __restrict view = nullptr, FILE* __restrict fp = nullptr)
{ t.template print2<vals ...>(sep, end, view, fp); }};
inline void printall(const char* __restrict sep = ",", const char* __restrict end = "\n",
const vector_type<uint32_t>* __restrict view = nullptr, FILE* __restrict fp = nullptr) {
applyIntegerSequence<sizeof...(Types), applier>::apply(*this, sep, end, view, fp);
}
};
template<class ...Types>
struct TableView {
const vector_type<uint32_t>* idxs;
const TableInfo<Types...>& info;
constexpr TableView(const vector_type<uint32_t>* idxs, const TableInfo<Types...>& info) noexcept : idxs(idxs), info(info) {}
void print(const char* __restrict sep, const char* __restrict end) const;
template <size_t j = 0>
typename std::enable_if<j == sizeof...(Types) - 1, void>::type print_impl(const uint32_t& i, const char* __restrict sep = " ") const;
template <size_t j = 0>
typename std::enable_if < j < sizeof...(Types) - 1, void>::type print_impl(const uint32_t& i, const char* __restrict sep = " ") const;
~TableView() {
delete idxs;
}
};
template <class T>
constexpr static inline bool is_vector(const ColRef<T>&) {
return true;
}
template <class T>
constexpr static inline bool is_vector(const vector_type<T>&) {
return true;
}
template<class ...Types>
TableInfo<Types...>::TableInfo(const char* name, uint32_t n_cols) : name(name), n_cols(n_cols) {
this->colrefs = (ColRef<void>*)malloc(sizeof(ColRef<void>) * n_cols);
}
template <class ...Types>
template <size_t j>
inline typename std::enable_if<j == sizeof...(Types) - 1, void>::type
TableView<Types ...>::print_impl(const uint32_t& i, const char* __restrict sep) const {
std::cout << (get<j>(*this))[(*idxs)[i]];
}
template<class ...Types>
template<size_t j>
inline typename std::enable_if < j < sizeof...(Types) - 1, void>::type
TableView<Types...>::print_impl(const uint32_t& i, const char* __restrict sep) const
{
std::cout << (get<j>(*this))[(*idxs)[i]] << sep;
print_impl<j + 1>(i, sep);
}
template<class ...Types>
inline void TableView<Types...>::print(const char* __restrict sep, const char* __restrict end) const {
std::string header_string = info.get_header_string(sep, end);
std::cout << header_string.c_str();
int n_rows = 0;
if (idxs && idxs->size > 0) // bound by the view, not the base table
n_rows = idxs->size;
for (int i = 0; i < n_rows; ++i) {
print_impl(i);
std::cout << end;
}
}
template <class ...Types>
template <size_t j>
inline typename std::enable_if<j == sizeof...(Types) - 1, void>::type
TableInfo<Types ...>::print_impl(const uint32_t& i, const char* __restrict sep) const {
std::cout << (get<j>(*this))[i];
}
template<class ...Types>
template<size_t j>
inline typename std::enable_if<j < sizeof...(Types) - 1, void>::type
TableInfo<Types...>::print_impl(const uint32_t& i, const char* __restrict sep) const
{
std::cout << (get<j>(*this))[i] << sep;
print_impl<j+1>(i, sep);
}
template<class ...Types>
inline void TableInfo<Types...>::print(const char* __restrict sep, const char* __restrict end) const {
std::string header_string = get_header_string(sep, end);
std::cout << header_string.c_str();
int n_rows = 0;
if (n_cols > 0 && colrefs[0].size > 0)
n_rows = colrefs[0].size;
for (int i = 0; i < n_rows; ++i) {
print_impl(i);
std::cout << end;
}
}
template <class T1, class T2, template<typename ...> class VT, template<typename ...> class VT2>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator -(const VT<T1>& lhs, const VT2<T2>& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] - rhs[i];
return ret;
}
template <class T1, class T2, template<typename ...> class VT>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator -(const VT<T1>& lhs, const T2& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] - rhs;
return ret;
}
template <class T1, class T2, template<typename ...> class VT, template<typename ...> class VT2>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator +(const VT<T1>& lhs, const VT2<T2>& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] + rhs[i];
return ret;
}
template <class T1, class T2, template<typename ...> class VT>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator +(const VT<T1>& lhs, const T2& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] + rhs;
return ret;
}
template <class T1, class T2, template<typename ...> class VT, template<typename ...> class VT2>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator *(const VT<T1>& lhs, const VT2<T2>& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] * rhs[i];
return ret;
}
template <class T1, class T2, template<typename ...> class VT>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator *(const VT<T1>& lhs, const T2& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] * rhs;
return ret;
}
template <class T1, class T2, template<typename ...> class VT, template<typename ...> class VT2>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator /(const VT<T1>& lhs, const VT2<T2>& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] / rhs[i];
return ret;
}
template <class T1, class T2, template<typename ...> class VT>
decayed_t<VT, typename types::Coercion<T1, T2>::type> operator /(const VT<T1>& lhs, const T2& rhs) {
auto ret = decayed_t<VT, typename types::Coercion<T1, T2>::type>(lhs.size, "");
for (int i = 0; i < lhs.size; ++i)
ret[i] = lhs[i] / rhs;
return ret;
}
template <class ...Types>
void print(const TableInfo<Types...>& v, const char* delimiter = " ", const char* endline = "\n") {
v.print(delimiter, endline);
}
template <class ...Types>
void print(const TableView<Types...>& v, const char* delimiter = " ", const char* endline = "\n") {
v.print(delimiter, endline);
}
template <class T>
void print(const T& v, const char* delimiter = " ") {
std::cout<< v;
// printf(types::printf_str[types::Types<T>::getType()], v);
}
template <class T>
void inline print_impl(const T& v, const char* delimiter, const char* endline) {
for (const auto& vi : v) {
print(vi);
std::cout << delimiter;
// printf("%s", delimiter);
}
std::cout << endline;
//printf("%s", endline);
}
template <class T, template<typename> class VT>
typename std::enable_if<!std::is_same<VT<T>, TableInfo<T>>::value>::type
print(const VT<T>& v, const char* delimiter = " ", const char* endline = "\n") {
print_impl(v, delimiter, endline);
}
#endif
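A hand-assembled sketch of the table machinery above; the names and literals are illustrative, and in normal operation the engine populates colrefs from parsed queries rather than by hand:

void table_sketch() {
    TableInfo<int, float> tbl("stocks", 2);
    auto& ts = get<0>(tbl); ts.init("timestamp");
    auto& px = get<1>(tbl); px.init("price");
    ts.emplace_back(1); px.emplace_back(15.f);
    ts.emplace_back(2); px.emplace_back(19.f);
    tbl.print2<0, 1>(",");           // "timestamp,price" header, then one printf per record
    auto* ord = tbl.order_by<-2>();  // negative column index sorts descending on price
    delete ord;
}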

@ -0,0 +1,20 @@
#include "types.h"
#include <string>
#include <iostream>
#include <chrono>
#include <ctime>
namespace types {
using namespace std;
using namespace chrono;
string date_t::toString() const {
// TODO: decode val into year/month/day once the storage format is settled
(void)val;
return string("??/??/????");
}
string time_t::toString() const {
// TODO: decode val into hh:mm:ss once the storage format is settled
(void)val;
return string("??:??:??");
}
}

@ -0,0 +1,192 @@
#ifndef _TYPES_H
#define _TYPES_H
#include <typeinfo>
#include <functional>
#include <cstdint>
#include <type_traits>
#include <string>
#ifdef _MSC_VER
#define __restrict__ __restrict
#endif
template <class T>
constexpr static inline bool is_vector(const T&) {
return false;
}
template <class T>
struct is_vector_impl : std::false_type {};
template <class T>
constexpr static bool is_vector_type = is_vector_impl<T>::value;
namespace types {
enum Type_t {
AINT, AFLOAT, ASTR, ADOUBLE, ALDOUBLE, ALONG, ASHORT, ADATE, ATIME, ACHAR,
AUINT, AULONG, AUSHORT, AUCHAR, VECTOR, NONE, ERROR
};
// "%Lf" is the long double conversion; "%llf" is not a valid printf conversion
static constexpr const char* printf_str[] = { "%d", "%f", "%s", "%lf", "%Lf", "%ld", "%hi", "%s", "%s", "%c",
"%u", "%lu", "%hu", "%hhu", "Vector<%s>", "NULL", "ERROR" };
// TODO: deal with data/time <=> str/uint conversion
struct date_t {
uint32_t val;
date_t(const char* d) {
}
std::string toString() const;
};
struct time_t {
uint32_t val;
time_t(const char* d) {
}
std::string toString() const;
};
template <typename T>
struct Types {
typedef T type;
constexpr Types() noexcept = default;
#define ConnectTypes(f) \
f(int, AINT) \
f(float, AFLOAT) \
f(const char*, ASTR) \
f(double, ADOUBLE) \
f(long double, ALDOUBLE) \
f(long, ALONG) \
f(short, ASHORT) \
f(date_t, ADATE) \
f(time_t, ATIME) \
f(char, ACHAR) \
f(unsigned int, AUINT) \
f(unsigned long, AULONG) \
f(unsigned short, AUSHORT) \
f(unsigned char, AUCHAR)
inline constexpr static Type_t getType() {
#define TypeConnect(x, y) if constexpr(std::is_same<x, T>::value) return y; else
ConnectTypes(TypeConnect)
if constexpr (is_vector_type<T>)
return VECTOR;
else
return NONE;
}
};
#define ATypeSize(t, at) sizeof(t),
static constexpr size_t AType_sizes[] = { ConnectTypes(ATypeSize) 1 };
#define Cond(c, x, y) typename std::conditional<c, x, y>::type
#define Comp(o) (sizeof(T1) o sizeof(T2))
#define Same(x, y) (std::is_same_v<x, y>)
#define _U(x) std::is_unsigned<x>::value
#define Fp(x) std::is_floating_point<x>::value
template <class T1, class T2>
struct Coercion {
using t1 = Cond(Comp(<= ), Cond(Comp(== ), Cond(Fp(T1), T1, Cond(Fp(T2), T2, Cond(_U(T1), T2, T1))), T2), T1);
using t2 = Cond(Same(T1, T2), T1, Cond(Same(T1, const char*) || Same(T2, const char*), const char*, void));
using type = Cond(Same(t2, void), Cond((Same(T1, date_t) && Same(T2, time_t)) || (Same(T1, time_t) && Same(T2, date_t)), void, t1), t2);
};
#define __Eq(x) (sizeof(T) == sizeof(x))
template<class T>
struct GetFPTypeImpl {
using type = Cond(__Eq(float), float, Cond(__Eq(double), double, long double));
};
template<class T>
using GetFPType = typename GetFPTypeImpl<typename std::decay<T>::type>::type;
template<class T>
struct GetLongTypeImpl {
using type = Cond(_U(T), unsigned long long, Cond(Fp(T), long double, long long));
};
template<class T>
using GetLongType = typename GetLongTypeImpl<typename std::decay<T>::type>::type;
}
#define getT(i, t) std::tuple_element_t<i, std::tuple<t...>>
template <template<typename ...> class T, typename ...Types>
struct applyTemplates {
using type = T<Types...>;
};
template <class lT, template <typename ...> class rT>
struct transTypes_s;
template <template<typename ...> class lT, typename ...T, template<typename ...> class rT>
struct transTypes_s<lT<T...>, rT> {
using type = rT<T...>;
};
// static_assert(std::is_same<transTypes<std::tuple<int, float>, std::unordered_map>, std::unordered_map<int, float>>::value);
template <class lT, template <typename ...> class rT>
using transTypes = typename transTypes_s<lT, rT>::type;
template <class ...Types>
struct record_types {};
template <class ...Types>
using record = std::tuple<Types...>;
template <class T>
struct decayS {
using type = typename std::decay<T>::type;
};
template<template<typename ...> class T, typename ...Types>
struct decayS <T<Types...>>{
using type = T<typename std::decay<Types>::type ...>;
};
template <class T>
using decays = typename decayS<typename std::decay<T>::type>::type;
template <class T>
using decay_inner = typename decayS<T>::type;
template <class, template <class...> class T>
struct instance_of_impl : std::false_type {};
template <class ...T1, template <class ...> class T2>
struct instance_of_impl<T2<T1...>, T2> : std::true_type {};
template <class T1, class T2>
struct same_class_impl : std::false_type {};
template <class ...T1s, class ...T2s, template <class...> class T1>
struct same_class_impl<T1<T1s...>, T1<T2s...>> : std::true_type {};
template <class T1, class T2>
constexpr bool same_class = same_class_impl<T1, T2>::value;
template <class T1, template <class...> class T2>
constexpr bool instance_of = instance_of_impl<T1, T2>::value;
template <class lT, class vT, template <vT ...> class rT>
struct transValues_s;
template <class vT, template<class, vT ...> class lT, vT ...T, template<vT ...> class rT>
struct transValues_s<lT<vT, T...>, vT, rT> {
using type = rT<T...>;
};
#include <utility>
template <class vT, int i, template <vT ...> class rT>
using transValues = typename transValues_s<std::make_integer_sequence<vT, i>, vT, rT>::type;
template <int i, template <int ...> class rT>
using applyIntegerSequence = typename transValues_s<std::make_integer_sequence<int, i>, int, rT>::type;
template <template <class ...> class T, class ...Types>
struct decayed_impl{ typedef T<Types...> type;};
template <template <typename ...> class VT, class ...Types>
using decayed_t = typename decayed_impl<VT, Types...>::type;
template <class First = void, class...Rest>
struct get_first_impl {
typedef First first;
constexpr static size_t rest_len = sizeof...(Rest);
typedef get_first_impl<Rest...> rest;
};
template <class ...T>
using get_first = typename get_first_impl<T...>::first;
template <class T>
struct value_type_impl { typedef T type; };
template <template <class...> class VT, class ...V>
struct value_type_impl<VT<V...>> { typedef get_first<V...> type; };
template <class T>
using value_type = typename value_type_impl<T>::type;
template <class T>
struct value_type_rec_impl { typedef T type; };
template <template <class...> class VT, class ...V>
struct value_type_rec_impl<VT<V...>> { typedef typename value_type_rec_impl<get_first<V...>>::type type; }; // recurse on the element type
template <class T>
using value_type_r = typename value_type_rec_impl<T>::type;
#endif // !_TYPES_H
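Compile-time sanity checks for the traits above; the size-based ones assume a typical ILP32/LP64 target where int and float are both 4 bytes:

static_assert(std::is_same_v<types::Coercion<int, float>::type, float>, "fp wins at equal width");
static_assert(std::is_same_v<types::Coercion<unsigned, int>::type, int>, "signed wins at equal width");
static_assert(std::is_same_v<types::GetFPType<int>, float>, "4-byte int maps to float");
static_assert(std::is_same_v<types::GetLongType<short>, long long>, "signed integrals widen to long long");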

@ -0,0 +1,18 @@
#include "utils.h"
#include <random>
#include <chrono>
using std::string;
string base62uuid(int l = 8) {
using namespace std;
constexpr static const char* base62alp = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
static mt19937_64 engine(chrono::system_clock::now().time_since_epoch().count());
static uniform_int_distribution<uint64_t> u(0x10000, 0xfffff);
uint64_t uuid = (u(engine) << 32ull) + (chrono::system_clock::now().time_since_epoch().count() & 0xffffffff);
printf("%llx\n", uuid);
string ret;
while (uuid && l-- >= 0) {
ret = string("") + base62alp[uuid % 62] + ret;
uuid /= 62;
}
return ret;
}

@ -0,0 +1,16 @@
#pragma once
#include <string>
#include <ctime>
#if ((defined(_MSVC_LANG) && _MSVC_LANG >= 201703L) || __cplusplus >= 201703L)
constexpr static bool cpp_17 = true;
#else
constexpr static bool cpp_17 = false;
#endif
template <class T>
inline const char* str(const T& v) {
return "";
}
template <>
inline const char* str(const bool& v) {
return v ? "true" : "false";
}

@ -0,0 +1,15 @@
#include "vector_type.hpp"
#include <iostream>
template<typename _Ty>
inline void vector_type<_Ty>::out(uint32_t n, const char* sep) const
{
n = n > size ? size : n;
std::cout << '(';
if (n > 0) // guard: n - 1 underflows for an empty vector
{
uint32_t i = 0;
for (; i < n - 1; ++i)
std::cout << this->operator[](i) << sep;
std::cout << this->operator[](i);
}
std::cout << ')';
}

@ -0,0 +1,278 @@
/*
* Bill Sun 2022
*/
#ifndef _VECTOR_TYPE
#define _VECTOR_TYPE
#include <cstring>
#include <cstdlib>
#include <cstdint>
#include <algorithm>
#include <initializer_list>
#include <vector>
#include <stdarg.h>
#include <limits>
#include <deque>
#include "types.h"
#pragma pack(push, 1)
template <typename _Ty>
class vector_type {
public:
typedef vector_type<_Ty> Decayed_t;
void inline _copy(const vector_type<_Ty>& vt) {
// quick init while using malloc
//if (capacity > 0) free(container);
this->size = vt.size;
this->capacity = vt.capacity;
this->container = (_Ty*)malloc(size * sizeof(_Ty));
memcpy(container, vt.container, sizeof(_Ty) * size);
}
void inline _move(vector_type<_Ty>&& vt) {
if (capacity > 0) free(container);
this->size = vt.size;
this->capacity = vt.capacity;
this->container = vt.container;
vt.size = vt.capacity = 0;
vt.container = 0;
}
public:
_Ty* container;
uint32_t size, capacity;
typedef _Ty* iterator_t;
vector_type(const uint32_t& size) : size(size), capacity(size) {
container = (_Ty*)malloc(size * sizeof(_Ty));
}
constexpr vector_type(std::initializer_list<_Ty> _l) {
size = capacity = _l.size();
_Ty* _container = this->container = (_Ty*)malloc(sizeof(_Ty) * _l.size());
for (const auto& l : _l) {
*(_container++) = l;
}
}
constexpr vector_type() noexcept : size(0), capacity(0), container(0) {};
constexpr vector_type(const vector_type<_Ty>& vt) noexcept : capacity(0) {
_copy(vt);
}
constexpr vector_type(vector_type<_Ty>&& vt) noexcept : capacity(0) {
_move(std::move(vt));
}
vector_type<_Ty>& operator =(const _Ty& vt) { // assignment returns a reference, not a copy
if (!container) {
container = (_Ty*)malloc(sizeof(_Ty));
capacity = 1;
}
size = 1;
container[0] = vt;
return *this;
}
vector_type<_Ty>& operator =(const vector_type<_Ty>& vt) {
_copy(vt);
return *this;
}
vector_type<_Ty>& operator =(vector_type<_Ty>&& vt) {
_move(std::move(vt));
return *this;
}
template <template <class> class VT>
vector_type<_Ty>& operator =(const VT<_Ty>& vt) {
if (capacity > 0) free(container);
container = static_cast<_Ty*>(malloc(vt.size * sizeof(_Ty)));
size = vt.size;
capacity = size;
for(uint32_t i = 0; i < size; ++i)
container[i] = vt[i];
return *this;
}
void emplace_back(_Ty _val) {
if (size >= capacity) { // geometric growth
capacity += 1 + (capacity >> 1);
_Ty* n_container = (_Ty*)malloc(capacity * sizeof(_Ty));
memcpy(n_container, container, sizeof(_Ty) * size);
free(container);
container = n_container;
}
container[size++] = _val;
}
iterator_t erase(iterator_t _it) {
#ifdef DEBUG
// Do bound checks
if (!(size && capacity && container &&
_it >= container && (_it - container) < size))
return 0;
#endif
iterator_t curr = _it + 1, _end = container + size--; // old end; then shrink
while (curr < _end)
*(curr - 1) = *(curr++);
return _it;
}
iterator_t begin() const {
return container;
}
iterator_t end() const {
return container + size;
}
iterator_t find(const _Ty item) const {
iterator_t curr = begin(), _end = end();
while (curr != _end && *curr != item) ++curr; // stop on the match itself
return curr;
}
_Ty& operator[](const uint32_t _i) const {
return container[_i];
}
void shrink_to_fit() {
if (size && capacity != size) {
capacity = size;
_Ty* _container = (_Ty*)malloc(sizeof(_Ty) * size);
memcpy(_container, container, sizeof(_Ty) * size);
free(container);
container = _container;
}
}
_Ty& back() {
return container[size - 1];
}
void qpop() {
size = size ? size - 1 : size;
}
void pop() {
if (size) {
--size;
if (capacity > (size << 1))
{
_Ty* new_container = (_Ty*)malloc(sizeof(_Ty) * size);
memcpy(new_container, container, sizeof(_Ty) * size);
free(container);
container = new_container;
capacity = size;
}
}
}
_Ty pop_back() {
return container[--size];
}
void merge(vector_type<_Ty>& _other) {
if (capacity < this->size + _other.size)
{
_Ty* new_container = (_Ty*)malloc(sizeof(_Ty) * (_other.size + this->size));
capacity = this->size + _other.size;
memcpy(new_container, container, sizeof(_Ty) * this->size);
memcpy(new_container + this->size, _other.container, sizeof(_Ty) * _other.size);
free(container);
container = new_container;
}
else
memcpy(container + this->size, _other.container, sizeof(_Ty) * _other.size);
size = this->size + _other.size;
}
template<class _Iter>
void merge(_Iter begin, _Iter end) {
unsigned int dist = static_cast<unsigned int>(std::distance(begin, end));
if (capacity < this->size + dist) {
_Ty* new_container = (_Ty*)malloc(sizeof(_Ty) * (dist + this->size));
capacity = this->size + dist;
memcpy(new_container, container, sizeof(_Ty) * this->size);
free(container);
container = new_container;
}
for (unsigned int i = 0; i < dist; ++i) {
container[i + this->size] = *(begin + i);
}
size = this->size + dist;
}
void out(uint32_t n = 4, const char* sep = " ") const;
~vector_type() {
if (capacity > 0) free(container);
container = 0; size = capacity = 0;
}
#define Op(o, x) \
template<typename T>\
vector_type<typename types::Coercion<_Ty, T>::type> inline x(const vector_type<T>& r) const {\
vector_type<typename types::Coercion<_Ty, T>::type> ret(size);\
for (int i = 0; i < size; ++i)\
ret[i] = container[i] o r[i];\
return ret;\
}
#define Opeq(o, x) \
template<typename T>\
inline vector_type<_Ty>& x##eq(const vector_type<T>& r) {\
for (uint32_t i = 0; i < size; ++i)\
container[i] o##= r[i];\
return *this;\
}
#define Ops(o, x) \
template<typename T>\
vector_type<typename types::Coercion<_Ty, T>::type> operator o (const vector_type<T>& r) const {\
[[likely]] if (r.size == size) {\
return x(r);\
}\
return vector_type<typename types::Coercion<_Ty, T>::type>(); /* size mismatch: empty result instead of falling off the end */\
}
#define Opseq(o, x) \
template<typename T>\
vector_type<_Ty>& operator o##= (const vector_type<T>& r) {\
[[likely]] if (r.size == size) {\
return x##eq(r);\
}\
return *this; /* size mismatch: left operand unchanged */\
}
#define _Make_Ops(M) \
M(+, add) \
M(-, minus) \
M(*, multi) \
M(/, div) \
M(%, mod)
_Make_Ops(Op)
_Make_Ops(Opeq)
_Make_Ops(Ops)
_Make_Ops(Opseq)
};
template <>
class vector_type<void> {
public:
void* container;
uint32_t size, capacity;
typedef void* iterator_t;
vector_type(uint32_t size) : size(size), capacity(size) {
container = (void*)malloc(size);
}
template<typename _Ty>
constexpr vector_type(std::initializer_list<_Ty> _l) {
size = capacity = _l.size();
this->container = malloc(sizeof(_Ty) * _l.size());
_Ty* _container = (_Ty*)this->container;
for (const auto& l : _l) {
*(_container++) = l;
}
}
constexpr vector_type() : size(0), capacity(0), container(0) {};
void* get(uint32_t i, types::Type_t atype){
return static_cast<void*>(static_cast<char*>(container) + (i * types::AType_sizes[atype]));
}
void operator[](const uint32_t& i) {
}
};
#pragma pack(pop)
#endif
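A short sketch of the element-wise arithmetic above; both operands must currently be the same length (mismatched sizes now yield an empty result rather than undefined behavior):

void vector_sketch() {
    vector_type<int>   a{ 1, 2, 3 };
    vector_type<float> b{ 0.5f, 0.5f, 0.5f };
    auto c = a + b;        // vector_type<float> via types::Coercion<int, float>
    c.out(3);              // prints (1.5 2.5 3.5)
    a.emplace_back(4);     // geometric growth: capacity += 1 + capacity/2
}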

@ -0,0 +1,36 @@
#include "winhelper.h"
#include <Windows.h>
void* dlopen(const char* lib, int)
{
return LoadLibraryA(lib);
}
void* dlsym(void* handle, const char* proc)
{
return reinterpret_cast<void*>(GetProcAddress(static_cast<HMODULE>(handle), proc));
}
int dlclose(void* handle)
{
return FreeLibrary(static_cast<HMODULE>(handle));
}
SharedMemory::SharedMemory(const char* fname)
{
this->hFileMap = CreateFileMappingA(INVALID_HANDLE_VALUE, NULL, PAGE_READWRITE, 0, 2, fname);
if (this->hFileMap)
this->pData = MapViewOfFile(this->hFileMap, FILE_MAP_ALL_ACCESS, 0, 0, 2);
else
this->pData = NULL;
}
void SharedMemory::FreeMemoryMap()
{
if (this->hFileMap)
if (this->pData)
UnmapViewOfFile(this->pData);
if (this->hFileMap)
CloseHandle(this->hFileMap);
}

@ -0,0 +1,14 @@
#ifndef _WINHELPER_H
#define _WINHELPER_H
static constexpr int RTLD_LAZY = 1;
void* dlopen(const char*, int);
void* dlsym(void*, const char*);
int dlclose(void*);
struct SharedMemory
{
void* hFileMap;
void* pData;
SharedMemory(const char*);
void FreeMemoryMap();
};
#endif
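The two mapped bytes act as a (running, ready) handshake with the poll loop in server.cpp: byte 0 keeps the loop alive, byte 1 signals a new dll.so to load. A sketch of the coordinator side, with an illustrative mapping name:

void coordinator_sketch() {
    SharedMemory shm("aquery_shm");   // mapping name is an assumption
    bool* flags = static_cast<bool*>(shm.pData);
    flags[0] = true;    // running: keeps the server poll loop alive
    // ... emit ./dll.so for the next query, then:
    flags[1] = true;    // ready: server dlopen()s dll.so, calls dllmain, clears it
    flags[0] = false;   // shutdown
    shm.FreeMemoryMap();
}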

@ -0,0 +1,32 @@
CREATE TABLE stocks(timestamp INT, price INT)
INSERT INTO stocks VALUES(1,15)
INSERT INTO stocks VALUES(2,19)
INSERT INTO stocks VALUES(3,16)
INSERT INTO stocks VALUES(4,17)
INSERT INTO stocks VALUES(5,15)
INSERT INTO stocks VALUES(6,13)
INSERT INTO stocks VALUES(7,5)
INSERT INTO stocks VALUES(8,8)
INSERT INTO stocks VALUES(9,7)
INSERT INTO stocks VALUES(10,13)
INSERT INTO stocks VALUES(11,11)
INSERT INTO stocks VALUES(12,14)
INSERT INTO stocks VALUES(13,10)
INSERT INTO stocks VALUES(14,5)
INSERT INTO stocks VALUES(15,2)
INSERT INTO stocks VALUES(16,5)
/*<k> "q1" </k>*/
SELECT max(price-min(timestamp)) FROM stocks
/*<k> "q2" </k>*/
SELECT max(price-mins(price)) FROM stocks
/*<k> "q3"</k>*/
SELECT price, timestamp FROM stocks where price - timestamp > 1 and not (price*timestamp<100)
/*<k> "q4"</k>*/
SELECT max(price-mins(price))
FROM stocks
ASSUMING DESC timestamp

@ -0,0 +1,11 @@
a, b, c, d
1,1,2,2
1,2,2,2
1,2,3,4
4,2,1,4
2,1,3,4
1,2,3,4
1,2,3,3
3,2,1,2
2,1,2,2
1,2,3,1