Compare commits

...

4 Commits

@ -4,10 +4,10 @@ MonetDB_INC =
Defines =
CXXFLAGS = --std=c++2a
ifeq ($(AQ_DEBUG), 1)
OPTFLAGS = -g3 -fsanitize=address -fsanitize=leak
OPTFLAGS = -g3 #-fsanitize=address
LINKFLAGS =
else
OPTFLAGS = -O3 -DNDEBUG -fno-stack-protector
OPTFLAGS = -Ofast -DNDEBUG -fno-stack-protector
LINKFLAGS = -flto -s
endif
SHAREDFLAGS = -shared

@ -1,31 +1,20 @@
# AQuery++ Database
### Please try the latest code in the dev branch if you encounter any problems. Use `git checkout dev` to switch branches.
## Introduction
AQuery++ Database is a cross-platform, In-Memory Column-Store Database that incorporates compiled query execution. (**Note**: If you encounter any problems, feel free to contact me via ys3540@nyu.edu)
# Architecture
![Architecture](./docs/arch-hybrid.svg)
## Docker (Recommended):
- See installation instructions from [docker.com](https://www.docker.com). Run **Docker Desktop** to start the Docker engine.
- In the AQuery root directory, type `make docker` to build the Docker image from scratch.
- Arm-based Mac users have to build and run the **x86_64** Docker image because MonetDB doesn't offer official binaries for arm64 Linux. (Run `docker buildx build --platform=linux/amd64 -t aquery .` instead of `make docker`)
- Finally, run the image in **interactive** mode (`docker run --name aquery -it aquery`)
- When you need to access the container again, run `docker start -ai aquery`
- If you need to access the system shell within AQuery, type `dbg` to activate the Python interpreter and type `os.system('sh')` to launch a shell.
- The Docker image is available on [Docker Hub](https://hub.docker.com/repository/docker/sunyinqi0508/aquery), but building the image yourself is highly recommended (see [#2](../../issues/2))
## CIMS Computer Lab (Only for NYU affiliates who have access)
1. Clone this git repo in CIMS.
2. Download the [patch](https://drive.google.com/file/d/1YkykhM6u0acZ-btQb4EUn4jAEXPT81cN/view?usp=sharing)
3. Decompress the patch to any directory and execute the script inside by typing `source ./cims.sh`. Please use the `source` command or `. ./cims.sh` (dot space) to execute the script because it sets environment variables. Also note that this script only works with bash and compatible shells (e.g. dash, zsh, but not csh)
4. Execute `python3 ./prompt.py`
## AQuery Compiler
- The query is first processed by the AQuery Compiler, which is composed of a frontend that parses the query into an AST and a backend that generates target code that delivers the query.
- The frontend of the AQuery++ Compiler is built on top of [mo-sql-parsing](https://github.com/klahnakoski/mo-sql-parsing) with modifications to handle the AQuery dialect and extensions.
- The backend of the AQuery++ Compiler generates target code dependent on the Execution Engine. It can be C++ code for the AQuery Execution Engine, SQL plus a C++ post-processor for the Hybrid Engine, or k9 for the k9 Engine.
## Execution Engines
- AQuery++ supports different execution engines thanks to the decoupled compiler structure.
- Hybrid Execution Engine: decouples the query into two parts. The SQL-compliant part is executed by an embedded version of MonetDB and everything else is executed by a post-processing module, which is generated by the AQuery++ Compiler in C++ and then compiled and executed.
- AQuery Library: a set of header-based libraries that provide column arithmetic and operations inspired by array programming languages like kdb. This library is used by the C++ post-processor code, which significantly reduces the complexity of the generated code and shortens compile time without sacrificing performance. The libraries can also be used by UDFs as well as user modules, which makes it easier for users to write simple, efficient yet powerful extensions.
## Singularity Container
1. build container `singularity build aquery.sif aquery.def`
2. execute container `singularity exec aquery.sif sh`
3. run AQuery `python3 ./prompt.py`
# Native Installation:
# Installation:
## Requirements
1. A recent version of Linux, Windows or macOS, with a recent C++ compiler that supports C++17 (1z). (C++20 is recommended if available, for heterogeneous lookup on unordered containers.)
- GCC: 9.0 or above (g++ 7.x, 8.x fail to handle fold-expressions due to a compiler bug)
@ -38,10 +27,6 @@ AQuery++ Database is a cross-platform, In-Memory Column-Store Database that inco
- On macOS, MonetDB can be installed with Homebrew: `brew install monetdb`.
3. Python 3.6 or above. Install the required packages in requirements.txt with `python3 -m pip install -r requirements.txt`
## Installation
AQuery is tested on mainstream operating systems such as Windows, macOS and Linux
### Windows
There're multiple options to run AQuery on Windows. But for better consistency I recommend using a simulated Linux environment such as **Windows Subsystem for Linux** (1 or 2), **Docker** or **Linux Virtual Machines**. You can also use the native toolchain from Microsoft Visual Studio or gcc from Winlabs/Cygwin/MinGW.
@ -97,7 +82,24 @@ There're multiple options to run AQuery on Windows. But for better consistency I
In this case, upgrade anaconda or your compiler or use the python from your OS or package manager instead. Or (**NOT recommended**) copy/link the library from your system (e.g. /usr/lib/x86_64-linux-gnu/libstdc++.so.6) to anaconda's library directory (e.g. ~/Anaconda3/lib/).
## Docker (Recommended):
- See installation instructions from [docker.com](https://www.docker.com). Run **Docker Desktop** to start the Docker engine.
- In the AQuery root directory, type `make docker` to build the Docker image from scratch.
- Arm-based Mac users have to build and run the **x86_64** Docker image because MonetDB doesn't offer official binaries for arm64 Linux. (Run `docker buildx build --platform=linux/amd64 -t aquery .` instead of `make docker`)
- Finally, run the image in **interactive** mode (`docker run --name aquery -it aquery`)
- When you need to access the container again, run `docker start -ai aquery`
- If you need to access the system shell within AQuery, type `dbg` to activate the Python interpreter and type `os.system('sh')` to launch a shell.
- The Docker image is available on [Docker Hub](https://hub.docker.com/repository/docker/sunyinqi0508/aquery), but building the image yourself is highly recommended (see [#2](../../issues/2))
## CIMS Computer Lab (Only for NYU affiliates who have access)
1. Clone this git repo in CIMS.
2. Download the [patch](https://drive.google.com/file/d/1YkykhM6u0acZ-btQb4EUn4jAEXPT81cN/view?usp=sharing)
3. Decompress the patch to any directory and execute the script inside by typing `source ./cims.sh`. Please use the `source` command or `. ./cims.sh` (dot space) to execute the script because it sets environment variables. Also note that this script only works with bash and compatible shells (e.g. dash, zsh, but not csh)
4. Execute `python3 ./prompt.py`
## Singularity Container
1. build container `singularity build aquery.sif aquery.def`
2. execute container `singularity exec aquery.sif sh`
3. run AQuery `python3 ./prompt.py`
# Usage
`python3 prompt.py` will launch the interactive command prompt. The server binary will be automatically rebuilt and started.
### Commands:
@ -268,17 +270,6 @@ SELECT * FROM my_table WHERE c1 > 10
- `sqrt(x), trunc(x), and other builtin math functions`: value-wise math operations. `sqrt(x)[i] = sqrt(x[i])`
- `pack(cols, ...)`: pack multiple columns with exact same type into a single column.
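The value-wise rule `sqrt(x)[i] = sqrt(x[i])` can be illustrated outside AQuery with a small Python sketch. This is only a model of the semantics stated above, not AQuery code; `apply_valuewise` is a hypothetical helper name, and plain lists stand in for columns.

```python
import math
from typing import Callable, List, Sequence


def apply_valuewise(fn: Callable[[float], float], column: Sequence[float]) -> List[float]:
    """Apply fn to every element, so that result[i] == fn(column[i])."""
    return [fn(v) for v in column]


x = [1.0, 4.0, 9.0]
roots = apply_valuewise(math.sqrt, x)            # [1.0, 2.0, 3.0]
truncated = apply_valuewise(math.trunc, [1.7, -2.3])  # [1, -2]
```

The output column always has the same length as the input column, which is what distinguishes these functions from aggregations.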
# Architecture
![Architecture](./docs/arch-hybrid.svg)
## AQuery Compiler
- The query is first processed by the AQuery Compiler, which is composed of a frontend that parses the query into an AST and a backend that generates target code that delivers the query.
- The frontend of the AQuery++ Compiler is built on top of [mo-sql-parsing](https://github.com/klahnakoski/mo-sql-parsing) with modifications to handle the AQuery dialect and extensions.
- The backend of the AQuery++ Compiler generates target code dependent on the Execution Engine. It can be C++ code for the AQuery Execution Engine, SQL plus a C++ post-processor for the Hybrid Engine, or k9 for the k9 Engine.
## Execution Engines
- AQuery++ supports different execution engines thanks to the decoupled compiler structure.
- Hybrid Execution Engine: decouples the query into two parts. The SQL-compliant part is executed by an embedded version of MonetDB and everything else is executed by a post-processing module, which is generated by the AQuery++ Compiler in C++ and then compiled and executed.
- AQuery Library: a set of header-based libraries that provide column arithmetic and operations inspired by array programming languages like kdb. This library is used by the C++ post-processor code, which significantly reduces the complexity of the generated code and shortens compile time without sacrificing performance. The libraries can also be used by UDFs as well as user modules, which makes it easier for users to write simple but powerful extensions.
# Roadmap
- [x] SQL Parser -> AQuery Parser (Front End)

@ -2,7 +2,7 @@
## GLOBAL CONFIGURATION FLAGS
version_string = '0.5.4a'
version_string = '0.6.0a'
add_path_to_ldpath = True
rebuild_backend = False
run_backend = True

@ -117,7 +117,7 @@ class build_manager:
else:
mgr.cxx = os.environ['CXX']
if 'AQ_DEBUG' not in os.environ:
os.environ['AQ_DEBUG'] = '0' if mgr.OptimizationLv else '1'
os.environ['AQ_DEBUG'] = ('0' if mgr.OptimizationLv != '0' else '1')
def libaquery_a(self):
self.build_cmd = [['rm', 'libaquery.a'],['make', 'libaquery']]
@ -184,7 +184,7 @@ class build_manager:
def __init__(self) -> None:
self.method = 'make'
self.cxx = ''
self.OptimizationLv = '0' # [O0, O1, O2, O3, Ofast]
self.OptimizationLv = '4' # [O0, O1, O2, O3, Ofast]
self.Platform = 'amd64'
self.PCH = os.environ['PCH'] if 'PCH' in os.environ else 1
self.StaticLib = 1
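Review note on the `AQ_DEBUG` change in this file: `OptimizationLv` is a string, and any non-empty Python string is truthy, including `'0'`. The earlier expression therefore could never select debug mode. A minimal sketch of the two expressions side by side (the function names are hypothetical, introduced only for this comparison):

```python
def aq_debug_old(optimization_lv: str) -> str:
    # Truthiness test: '0' is a non-empty string, so this always yields '0'.
    return '0' if optimization_lv else '1'


def aq_debug_new(optimization_lv: str) -> str:
    # Explicit comparison: only optimization level '0' turns debug mode on.
    return '0' if optimization_lv != '0' else '1'


assert aq_debug_old('0') == '0'   # debug mode was never enabled
assert aq_debug_new('0') == '1'   # O0 now maps to AQ_DEBUG=1
assert aq_debug_new('4') == '0'   # the new default ('4') stays optimized
```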

@ -80,7 +80,7 @@ int gen_trade_data(int argc, char* argv[])
memmove(p + lens[i], p + lens[0], (lens[i - 1] - lens[i]) * sizeof(int));
permutation(p, lens[0] + N);
// for (int i = 0; i < lens[0] + N; ++i) printf("%d ", p[i]);
FILE* fp = fopen("trade.csv", "w");
FILE* fp = fopen("trade.csv", "wb");
int* last_price = new int[N];
memset(last_price, -1, sizeof(int) * N);
fprintf(fp, "stocksymbol, time, quantity, price\n");
@ -131,7 +131,7 @@ int gen_stock_data(int argc, char* argv[]){
}
IDs[n_stocks] = "S";
names[n_stocks] = "x";
FILE* fp = fopen("./data/stock.csv", "w");
FILE* fp = fopen("./data/stock.csv", "wb");
fprintf(fp, "ID, timestamp, tradeDate, price\n");
char date_str_buf [types::date_t::string_length()];
int* timestamps = new int[n_data];
@ -142,7 +142,7 @@ int gen_stock_data(int argc, char* argv[]){
fprintf(fp, "%s,%d,%s,%d\n", IDs[ui(engine)%(n_stocks + 1)].c_str(), timestamps[i], date, ui(engine) % 1000);
}
fclose(fp);
fp = fopen("./data/base.csv", "w");
fp = fopen("./data/base.csv", "wb");
fprintf(fp, "ID, name\n");
for(int i = 0; i < n_stocks + 1; ++ i){
fprintf(fp, "%s,%s\n", IDs[i].c_str(), names[i].c_str());
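Review note on the `"w"` to `"wb"` changes: the diff does not state the motivation. A common reason is that text-mode streams on Windows translate `\n` into `\r\n`, while binary mode writes bytes unchanged, so generated CSV files stay identical across platforms. The Python sketch below emulates that translation on any OS with `newline='\r\n'`; it illustrates the assumed motivation and is not part of the commit. `written_bytes` is a hypothetical helper.

```python
import os
import tempfile


def written_bytes(text: str, binary: bool) -> bytes:
    """Write text to a temp file in binary or emulated Windows text mode and return the raw bytes."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        if binary:
            with open(path, 'wb') as fp:
                fp.write(text.encode('utf-8'))
        else:
            # newline='\r\n' makes Python translate every '\n' on write,
            # which mimics what a Windows text-mode stream does.
            with open(path, 'w', newline='\r\n', encoding='utf-8') as fp:
                fp.write(text)
        with open(path, 'rb') as fp:
            return fp.read()
    finally:
        os.remove(path)


text_mode = written_bytes("ID, name\n", binary=False)
binary_mode = written_bytes("ID, name\n", binary=True)
```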

@ -110,7 +110,7 @@ class outfile(ast_node):
filename = node['loc']['literal'] if 'loc' in node else node['literal']
sep = ',' if 'term' not in node else node['term']['literal']
file_pointer = 'fp_' + base62uuid(6)
self.emit(f'FILE* {file_pointer} = fopen("{filename}", "w");')
self.emit(f'FILE* {file_pointer} = fopen("{filename}", "wb");')
self.emit(f'{out_table.cxt_name}->printall("{sep}", "\\n", nullptr, {file_pointer});')
self.emit(f'fclose({file_pointer});')
# self.context.headers.add('fstream')

@ -0,0 +1,72 @@
#include "./server/libaquery.h"
#ifndef __AQ_USE_THREADEDGC__
#include "./server/gc.h"
__AQEXPORT__(void) __AQ_Init_GC__(Context* cxt) {
GC::gc_handle = static_cast<GC*>(cxt->gc);
}
#else // __AQ_USE_THREADEDGC__
#define __AQ_Init_GC__(x)
#endif // __AQ_USE_THREADEDGC__
#include "./server/hasher.h"
#include "./server/monetdb_conn.h"
#include "./server/aggregations.h"
__AQEXPORT__(int) dll_2Cxoox(Context* cxt) {
using namespace std;
using namespace types;
auto server = static_cast<Server*>(cxt->alt_server);
auto len_4ycjiV = server->cnt;
auto mont_8AE = ColRef<const char*>(len_4ycjiV, server->getCol(0));
auto sales_2RB = ColRef<int>(len_4ycjiV, server->getCol(1));
const char* names_6pIt[] = {"mont", "minw2ysales"};
auto out_2LuaMH = new TableInfo<const char*,vector_type<double>>("out_2LuaMH", names_6pIt);
decltype(auto) col_EeW23s = out_2LuaMH->get_col<0>();
decltype(auto) col_5gY1Dm = out_2LuaMH->get_col<1>();
typedef record<decays<decltype(mont_8AE)::value_t>> record_typegj3e8Xf;
ankerl::unordered_dense::map<record_typegj3e8Xf, uint32_t, transTypes<record_typegj3e8Xf, hasher>> gMzMTEvd;
gMzMTEvd.reserve(mont_8AE.size);
uint32_t* reversemap = new uint32_t[mont_8AE.size<<1],
*mapbase = reversemap + mont_8AE.size;
for (uint32_t i2E = 0; i2E < mont_8AE.size; ++i2E){
reversemap[i2E] = gMzMTEvd.hashtable_push(forward_as_tuple(mont_8AE[i2E]));
}
auto arr_values = gMzMTEvd.values().data();
auto arr_len = gMzMTEvd.size();
uint32_t* seconds = new uint32_t[gMzMTEvd.size()];
auto vecs = static_cast<vector_type<uint32_t>*>(malloc(sizeof(vector_type<uint32_t>) * arr_len));
vecs[0].init_from(arr_values[0].second, mapbase);
for (uint32_t i = 1; i < arr_len; ++i) {
vecs[i].init_from(arr_values[i].second, mapbase + arr_values[i - 1].second);
arr_values[i].second += arr_values[i - 1].second;
}
for (uint32_t i = 0; i < mont_8AE.size; ++i) {
auto id = reversemap[i];
mapbase[--arr_values[id].second] = i;
}
col_EeW23s.reserve(gMzMTEvd.size());
col_5gY1Dm.reserve(gMzMTEvd.size());
auto buf_col_5gY1Dm = new double[mont_8AE.size];
for (uint32_t i = 0; i < arr_len; ++i) {
col_5gY1Dm[i].init_from(vecs[i].size, buf_col_5gY1Dm + arr_values[i].second);
}
for (uint32_t i = 0; i < arr_len; ++i) {
auto &key_3iNX3qG = arr_values[i].first;
auto &val_7jjv8Mo = arr_values[i].second;
col_EeW23s.emplace_back(get<0>(key_3iNX3qG));
avgw(10, sales_2RB[vecs[i]], col_5gY1Dm[i]);
}
//print(*out_2LuaMH);
//FILE* fp_5LQeym = fopen("flatten.csv", "wb");
out_2LuaMH->printall(",", "\n", nullptr, nullptr, 10);
//fclose(fp_5LQeym);
puts("done.");
return 0;
}
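Review note on the generated module above: as far as this hunk shows, the group-by is a counting-sort style pass. Each row gets a group id from the hash table, the per-group counts are turned into running end offsets, and row indices are then scattered backwards into one flat buffer so that each group owns a contiguous slice. The Python sketch below models that idea under those assumptions; it is not the code AQuery++ emits, and `group_row_indices` is a hypothetical name.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def group_row_indices(keys: Sequence[Hashable]) -> Tuple[List[Hashable], List[int], List[int]]:
    """Return (unique keys, start offset per group, flat row-index buffer)."""
    group_of_key: Dict[Hashable, int] = {}
    unique: List[Hashable] = []
    counts: List[int] = []
    row_group: List[int] = []          # plays the role of `reversemap`
    for k in keys:
        gid = group_of_key.get(k)
        if gid is None:
            gid = len(unique)
            group_of_key[k] = gid
            unique.append(k)
            counts.append(0)
        counts[gid] += 1
        row_group.append(gid)

    # Inclusive prefix sum: cursor[g] is one past the end of group g's slice.
    cursor: List[int] = []
    total = 0
    for c in counts:
        total += c
        cursor.append(total)

    # Scatter rows from the back of each slice; afterwards cursor[g] is the slice start.
    flat = [0] * len(keys)             # plays the role of `mapbase`
    for row, gid in enumerate(row_group):
        cursor[gid] -= 1
        flat[cursor[gid]] = row
    return unique, cursor, flat


unique, starts, flat = group_row_indices(["jan", "feb", "jan", "mar", "feb", "jan"])
# unique == ["jan", "feb", "mar"], starts == [0, 3, 5], flat == [5, 2, 0, 4, 1, 3]
```

Because the scatter fills each slice from its end, row indices inside a group come out in descending row order. That is harmless for order-insensitive aggregates but worth keeping in mind for windowed ones such as the `avgw` call above.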

@ -0,0 +1,51 @@
import struct
import readline
from typing import List
name : str = input()
def write():
s : str = input()
qs : List[str] = []
while(len(s) and not s.startswith('S')):
qs.append(s)
s = input()
ms : int = int(input())
with open(f'./procedures/{name}.aqp', 'wb') as fp:
fp.write(struct.pack("I", len(qs) + (ms > 0)))
fp.write(struct.pack("I", ms))
if (ms > 0):
fp.write(b'N\x00')
for q in qs:
fp.write(q.encode('utf-8'))
if q.startswith('Q'):
fp.write(b'\n ')
fp.write(b'\x00')
def read():
with open(f'./procedures/{name}.aqp', 'rb') as fp:
nq = struct.unpack("I", fp.read(4))[0]
ms = struct.unpack("I", fp.read(4))[0]
qs = fp.read().split(b'\x00')
print(f'Procedure {name}, {nq} queries, {ms} modules:')
for q in qs:
print(' ' + q.decode('utf-8'))
if __name__ == '__main__':
while True:
cmd = input("r for read, w for write: ")
if cmd.lower().startswith('r'):
read()
break
elif cmd.lower().startswith('w'):
write()
break
elif cmd.lower().startswith('q'):
break
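Review note on the `.aqp` layout used by the script above: two native `unsigned int` fields (entry count, module count), an optional `N` entry when modules are present, then NUL-terminated entries, with `"\n "` appended to entries that start with `Q`. The sketch below repeats that layout in memory so it can be checked without touching `./procedures/`. The function names are hypothetical, and dropping the trailing empty entry after the split is a choice made here, not something the script does.

```python
import io
import struct
from typing import List, Tuple


def pack_procedure(queries: List[str], n_modules: int) -> bytes:
    """Serialize queries using the same field order as the write() helper above."""
    buf = io.BytesIO()
    buf.write(struct.pack("I", len(queries) + (n_modules > 0)))
    buf.write(struct.pack("I", n_modules))
    if n_modules > 0:
        buf.write(b'N\x00')
    for q in queries:
        buf.write(q.encode('utf-8'))
        if q.startswith('Q'):
            buf.write(b'\n ')
        buf.write(b'\x00')
    return buf.getvalue()


def unpack_procedure(blob: bytes) -> Tuple[int, int, List[str]]:
    """Parse the two header fields and split the body on NUL bytes."""
    header_size = struct.calcsize("II")
    n_entries, n_modules = struct.unpack_from("II", blob, 0)
    entries = [e.decode('utf-8') for e in blob[header_size:].split(b'\x00')]
    if entries and entries[-1] == '':
        entries.pop()  # the final terminator leaves one empty trailing piece
    return n_entries, n_modules, entries


blob = pack_procedure(["Qselect 1", "Qselect 2"], 1)
n_entries, n_modules, entries = unpack_procedure(blob)
```

Note that `"I"` uses the platform's native byte order, so a file written on one architecture is only guaranteed to load on machines with the same endianness.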

@ -1300,7 +1300,7 @@ class outfile(ast_node):
filename = self.node['loc']['literal'] if 'loc' in self.node else self.node['literal']
sep = ',' if 'term' not in self.node else self.node['term']['literal']
file_pointer = 'fp_' + base62uuid(6)
self.addc(f'FILE* {file_pointer} = fopen("{filename}", "w");')
self.addc(f'FILE* {file_pointer} = fopen("{filename}", "wb");')
self.addc(f'{self.parent.out_table.contextname_cpp}->printall("{sep}", "\\n", nullptr, {file_pointer});')
self.addc(f'fclose({file_pointer});')
self.context.ccode += self.ccode

@ -4,6 +4,7 @@
#include <string>
#include <limits>
#include <cstring>
#include <string_view>
template <class ...Types>
std::string generate_printf_string(const char* sep = " ", const char* end = "\n") {
std::string str;
@ -25,6 +26,11 @@ inline decltype(auto) print_hook<bool>(const bool& v) {
return v? "true" : "false";
}
template<>
inline decltype(auto) print_hook<std::string_view>(const std::string_view& v) {
return v.data();
}
extern char* gbuf;
void setgbuf(char* buf = 0);

@ -55,6 +55,7 @@ void print<bool>(const bool&v, const char* delimiter){
std::cout<< (v?"true":"false") << delimiter;
}
template<class T>
T getInt(const char*& buf){
T ret = 0;
@ -537,4 +538,11 @@ aq_to_chars<types::timestamp_t>(void* value, char* buffer) {
return buffer;
}
template<>
char*
aq_to_chars<std::string_view>(void* value, char* buffer){
const auto& src = *static_cast<std::string_view*>(value);
memcpy(buffer, src.data(), src.size());
return buffer + src.size();
}

@ -161,6 +161,7 @@ template<> char* aq_to_chars<char*>(void* , char*);
template<> char* aq_to_chars<types::date_t>(void* , char*);
template<> char* aq_to_chars<types::time_t>(void* , char*);
template<> char* aq_to_chars<types::timestamp_t>(void* , char*);
template<> char* aq_to_chars<std::string_view>(void* , char*);
typedef int (*code_snippet)(void*);
template <class _This_Struct>

@ -6,6 +6,8 @@
#include "monetdb_conn.h"
#include "monetdbe.h"
#include "table.h"
#include <thread>
#undef ERROR
#undef static_assert
@ -86,7 +88,10 @@ void Server::connect(Context *cxt){
}
server = (monetdbe_database*)malloc(sizeof(monetdbe_database));
auto ret = monetdbe_open(server, nullptr, nullptr);
monetdbe_options ops;
AQ_ZeroMemory(ops);
ops.nr_threads = std::thread::hardware_concurrency();
auto ret = monetdbe_open(server, nullptr, &ops);
if (ret == 0){
status = true;
this->server = server;

@ -191,6 +191,21 @@ constexpr prt_fn_t monetdbe_prtfns[] = {
aq_to_chars<std::nullptr_t>
};
#ifndef __AQ_USE_THREADEDGC__
void aq_init_gc(void *handle, Context* cxt)
{
typedef void (*aq_gc_init_t) (Context*);
if (handle && cxt){
auto sym = dlsym(handle, "__AQ_Init_GC__");
if(sym){
((aq_gc_init_t)sym)(cxt);
}
}
}
#else //__AQ_USE_THREADEDGC__
#define aq_init_gc(h, c)
#endif //__AQ_USE_THREADEDGC__
#include "monetdbe.h"
#undef max
#undef min
@ -363,12 +378,7 @@ start:
recorded_queries.emplace_back(copy_lpstr("N"));
}
handle = dlopen(proc_name, RTLD_NOW);
#ifndef __AQ_USE_THREADEDGC__
{
typedef void (*aq_gc_init_t) (Context*);
((aq_gc_init_t)dlsym(handle, "__AQ_Init_GC__"))(cxt);
}
#endif
aq_init_gc(handle, cxt);
if (procedure_recording) {
recorded_libraries.emplace_back(handle);
}
@ -474,11 +484,13 @@ start:
p.__rt_loaded_modules = static_cast<void**>(
malloc(sizeof(void*) * p.postproc_modules));
for(uint32_t j = 0; j < p.postproc_modules; ++j){
auto pj = dlopen(p.name, RTLD_NOW);
auto pj = dlopen((procedure_root + p.name + std::to_string(j) + ".so").c_str(), RTLD_NOW);
if (pj == nullptr){
printf("Error: failed to load module %s\n", p.name);
return true;
}
aq_init_gc(pj, cxt);
p.__rt_loaded_modules[j] = pj;
}
}
@ -503,6 +515,7 @@ start:
};
const auto& load_proc_fromfile = [&](StoredProcedure& p) {
auto config_name = procedure_root + p.name + ".aqp";
puts(p.name);
auto fp = fopen(config_name.c_str(), "rb");
if(fp == nullptr){
puts("ERROR: Procedure not found on disk.");
@ -517,14 +530,17 @@ start:
p.queries = static_cast<char**>(malloc(sizeof(char*) * p.cnt));
p.queries[0] = static_cast<char*>(malloc(sizeof(char) * queries_size));
fread(&p.queries[0], queries_size, 1, fp);
fread(p.queries[0], 1, queries_size, fp);
for(uint32_t j = 1; j < p.cnt; ++j){
p.queries[j] = p.queries[j-1];
while(*p.queries[j] != '\0')
while(*(p.queries[j]) != '\0')
++p.queries[j];
++p.queries[j];
puts(p.queries[j-1]);
}
fclose(fp);
p.__rt_loaded_modules = 0;
return load_modules(p);
};
switch(n_recvd[i][1]){
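Review note on `load_proc_fromfile`: the corrected `fread(p.queries[0], 1, queries_size, fp)` reads into the allocated buffer, where the earlier call passed the address of the pointer slot itself. The loop that follows then derives each later entry's start by walking past the previous entry's NUL terminator. A Python sketch of that pointer walk, using offsets into a `bytes` object instead of `char*` (hypothetical function name, and it assumes the buffer holds at least `count` NUL-terminated entries):

```python
from typing import List


def split_queries(buffer: bytes, count: int) -> List[bytes]:
    """Split a buffer of NUL-terminated entries into `count` byte strings."""
    if count == 0:
        return []
    starts = [0]
    for _ in range(1, count):
        pos = starts[-1]
        while buffer[pos] != 0:   # advance to the previous entry's terminator
            pos += 1
        starts.append(pos + 1)    # the next entry begins right after it
    # Each entry runs from its start up to (not including) its own terminator.
    return [buffer[s:buffer.index(0, s)] for s in starts]


parts = split_queries(b'N\x00Qselect 1\n \x00', 2)
```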
@ -553,18 +569,22 @@ start:
auto _proc = cxt->stored_proc.find(proc_name);
if (_proc == cxt->stored_proc.end()){
printf("Procedure %s not found. Trying load from disk.\n", proc_name);
if (load_proc_fromfile(current_procedure)){
current_procedure.name = copy_lpstr(proc_name);
if (!load_proc_fromfile(current_procedure)){
cxt->stored_proc.insert_or_assign(proc_name, current_procedure);
}
else {
continue;
}
}
else{
current_procedure = _proc->second;
n_recv = current_procedure.cnt;
n_recvd = current_procedure.queries;
load_modules(current_procedure);
procedure_replaying = true;
goto start; // yes, I know, refactor later!!
}
n_recv = current_procedure.cnt;
n_recvd = current_procedure.queries;
load_modules(current_procedure);
procedure_replaying = true;
goto start; // yes, I know, refactor later!!
}
break;
case 'D': // delete procedure
@ -572,6 +592,9 @@ start:
case 'S': //save procedure
break;
case 'L': //load procedure
if (!load_proc_fromfile(current_procedure)) {
cxt->stored_proc.insert_or_assign(proc_name, current_procedure);
}
break;
case 'd': // display all procedures
for(const auto& p : cxt->stored_proc){

@ -10,6 +10,7 @@
#include <algorithm>
#include <cstdarg>
#include <vector>
#include <string_view>
#include "io.h"
#include "hasher.h"
@ -289,6 +290,7 @@ public:
uint32_t len = end - start;
return ColView<_Ty>(orig, idxs.subvec(start, end));
}
ColRef<_Ty> subvec_deep(uint32_t start, uint32_t end) const {
uint32_t len = end - start;
ColRef<_Ty> subvec(len);
@ -473,7 +475,7 @@ struct TableInfo {
std::string printf_string =
generate_printf_string<typename std::tuple_element<cols, tuple_type>::type ...>(sep, end);
// puts(printf_string.c_str());
puts(printf_string.c_str());
std::string header_string = std::string();
constexpr static int a_cols[] = { cols... };
if (fp == nullptr){
@ -487,6 +489,7 @@ struct TableInfo {
if (header_string.size() - l_sep >= 0)
header_string.resize(header_string.size() - l_sep);
}
const auto& prt_loop = [&fp, &view, &printf_string, *this, &limit](const auto& f) {
#ifdef __AQ__HAS__INT128__
constexpr auto num_hge = count_type<__int128_t, __uint128_t>((tuple_type*)(0));
@ -919,6 +922,7 @@ template <class ...Types>
void print(const TableView<Types...>& v, const char* delimiter = " ", const char* endline = "\n") {
v.print(delimiter, endline);
}
template <class T>
void print(const T& v, const char* delimiter = " ") {
std::cout << v << delimiter;
@ -933,7 +937,6 @@ void print<__uint128_t>(const __uint128_t& v, const char* delimiter);
#endif
template <>
void print<bool>(const bool& v, const char* delimiter);
template <class T>
void inline print_impl(const T& v, const char* delimiter, const char* endline) {
for (const auto& vi : v) {

@ -13,7 +13,7 @@ long long testing_throughput(uint32_t n_jobs, bool prompt = true){
auto tp = ThreadPool(thread::hardware_concurrency());
getchar();
auto i = 0u;
fp = fopen("tmp.tmp", "w");
fp = fopen("tmp.tmp", "wb");
auto time = chrono::high_resolution_clock::now();
while(i++ < n_jobs) tp.enqueue_task({ [](void* f) {fprintf(fp, "%d ", *(int*)f); free(f); }, new int(i) });
puts("done dispatching.");
@ -53,7 +53,7 @@ long long testing_transaction(uint32_t n_burst, uint32_t n_batch,
}
long long testing_destruction(bool prompt = true){
fp = fopen("tmp.tmp", "w");
fp = fopen("tmp.tmp", "wb");
if (prompt) {
puts("Press any key to start.");
getchar();

@ -3,6 +3,9 @@
#include <cstdint>
#include <type_traits>
#include <tuple>
#include <string_view>
#include <string>
#include <utility>
using std::size_t;
#if defined(__SIZEOF_INT128__) and not defined(_WIN32)
@ -13,6 +16,9 @@ using std::size_t;
#define __restrict__ __restrict
#endif
template<class T>
struct vector_base {};
template <class T>
constexpr static inline bool is_vector(const T&) {
return false;
@ -32,23 +38,23 @@ struct aqis_same_impl {
std::conditional_t<
std::is_same_v<T1, bool> || std::is_same_v<T2, bool>,
std::bool_constant<std::is_same_v<T1, bool> && std::is_same_v<T2, bool>>,
Cond(
(std::is_same_v<T1, bool> && std::is_same_v<T2, bool>),
std::true_type,
std::false_type
),
Cond(
std::is_signed_v<T1> == std::is_signed_v<T2>,
Cond(
std::is_floating_point_v<T1> == std::is_floating_point_v<T2>,
Cond(
aq_szof<T1> == aq_szof<T2>, // deal with sizeof(void)
std::true_type,
std::false_type
),
std::false_type
),
std::false_type
!(std::is_class_v<T1> || std::is_class_v<T2>),
Cond(
std::is_signed_v<T1> == std::is_signed_v<T2>,
Cond(
std::is_floating_point_v<T1> == std::is_floating_point_v<T2>,
std::bool_constant<aq_szof<T1> == aq_szof<T2>>, // deal with sizeof(void)
std::false_type
),
std::false_type
),
Cond(
(std::is_class_v<T1> && std::is_class_v<T2>),
std::bool_constant<(std::is_base_of_v<T1, T2> || std::is_base_of_v<T2, T1>)>,
std::false_type
)
)
>::value;
};
@ -63,12 +69,12 @@ constexpr bool aqis_same<T1, T2> = aqis_same_impl<T1, T2>::value;
namespace types {
enum Type_t {
AINT32, AFLOAT, ASTR, ADOUBLE, ALDOUBLE, AINT64, AINT128, AINT16, ADATE, ATIME, AINT8,
AUINT32, AUINT64, AUINT128, AUINT16, AUINT8, ABOOL, VECTOR, ATIMESTAMP, ACHAR, NONE, ERROR
AUINT32, AUINT64, AUINT128, AUINT16, AUINT8, ABOOL, VECTOR, ATIMESTAMP, ACHAR, ASV, NONE, ERROR
};
static constexpr const char* printf_str[] = { "%d", "%f", "%s", "%lf", "%Lf", "%ld", "%d", "%hi", "%s", "%s", "%hhd",
"%u", "%lu", "%s", "%hu", "%hhu", "%s", "%s", "Vector<%s>", "%s", "%c", "NULL", "ERROR" };
static constexpr const char* printf_str[] = { "%d", "%f", "%s", "%lf", "%Lf", "%ld", "%s", "%hi", "%s", "%s", "%hhd",
"%u", "%lu", "%s", "%hu", "%hhu", "%s", "Vector<%s>", "%s", "%c", "%s", "NULL", "ERROR" };
static constexpr const char* SQL_Type[] = { "INT", "REAL", "TEXT", "DOUBLE", "DOUBLE", "BIGINT", "HUGEINT", "SMALLINT", "DATE", "TIME", "TINYINT",
"INT", "BIGINT", "HUGEINT", "SMALLINT", "TINYINT", "BOOL", "HUGEINT", "TIMESTAMP", "CHAR", "NULL", "ERROR"};
"INT", "BIGINT", "HUGEINT", "SMALLINT", "TINYINT", "BOOL", "HUGEINT", "TIMESTAMP", "CHAR", "TEXT", "NULL", "ERROR"};
// TODO: deal with data/time <=> str/uint conversion
@ -169,6 +175,8 @@ namespace types {
f(unsigned short, AUINT16) \
f(bool, ABOOL) \
f(timestamp_t, ATIMESTAMP) \
f(std::string_view, ASV) \
f(std::string, ASV) \
F_INT128(f)
inline constexpr static Type_t getType() {
@ -399,7 +407,6 @@ struct transValues_s<lT<vT, T...>, vT, rT> {
using type = rT<T...>;
};
#include <utility>
template <class vT, int i, template <vT ...> class rT>
using transValues = typename transValues_s<std::make_integer_sequence<vT, i>, vT, rT>::type;
template <int i, template <int ...> class rT>
@ -427,8 +434,17 @@ template <class ...T>
using get_first = typename get_first_impl<T...>::first;
template <class T>
struct value_type_rec_impl { typedef T type; };
template <template <class...> class VT, class ...V>
struct value_type_rec_impl<VT<V...>> { typedef typename value_type_rec_impl<get_first<V...>>::type type; };
struct value_type_rec_impl<VT<V...>> {
typedef typename
std::conditional_t<
std::is_base_of_v<vector_base<get_first<V...>>, VT<V...>>,
typename value_type_rec_impl<get_first<V...>>::type,
VT<V...>
> type;
};
template <class T>
using value_type_r = typename value_type_rec_impl<T>::type;

@ -1059,15 +1059,13 @@ public:
return do_insert_or_assign(std::move(key), std::forward<M>(mapped)).first;
}
template <class K>
unsigned hashtable_push(K&& key) {
auto it_isinserted = try_emplace(std::forward<K>(key), 1);
if (!it_isinserted.second) {
++ it_isinserted.first->second;
return static_cast<unsigned>(it_isinserted.first - begin());
}
return static_cast<unsigned>(end() - begin() - 1);
}
// template <class K>
// bool hashtable_push(K&& key) {
// auto it_isinserted = try_emplace(std::forward<K>(key), 1);
// if (!it_isinserted.second)
// ++ it_isinserted.first->second;
// return it_isinserted.second;
// }
template <typename K,
typename M,
@ -1112,6 +1110,66 @@ public:
return {begin() + static_cast<difference_type>(value_idx), true};
}
template <class K,
typename Q = T,
typename H = Hash,
typename KE = KeyEqual,
std::enable_if_t<!is_map_v<Q> && is_transparent_v<H, KE>, bool> = true>
auto hashtable_push(K&& key) -> unsigned {
if (is_full()) {
increase_size();
}
auto hash = mixed_hash(key);
auto dist_and_fingerprint = dist_and_fingerprint_from_hash(hash);
auto bucket_idx = bucket_idx_from_hash(hash);
while (dist_and_fingerprint <= at(m_buckets, bucket_idx).m_dist_and_fingerprint) {
if (dist_and_fingerprint == at(m_buckets, bucket_idx).m_dist_and_fingerprint &&
m_equal(key, m_values[at(m_buckets, bucket_idx).m_value_idx])) {
// found it, return without ever actually creating anything
return static_cast<uint32_t>(at(m_buckets, bucket_idx).m_value_idx);
}
dist_and_fingerprint = dist_inc(dist_and_fingerprint);
bucket_idx = next(bucket_idx);
}
// value is new, insert element first, so when exception happens we are in a valid state
m_values.emplace_back(std::forward<K>(key));
// now place the bucket and shift up until we find an empty spot
auto value_idx = static_cast<value_idx_type>(m_values.size() - 1);
place_and_shift_up({dist_and_fingerprint, value_idx}, bucket_idx);
return static_cast<uint32_t>(value_idx);
}
template <class... Args>
auto hashtable_push(Args&&... args) -> unsigned {
if (is_full()) {
increase_size();
}
// we have to instantiate the value_type to be able to access the key.
// 1. emplace_back the object so it is constructed. 2. If the key is already there, pop it later in the loop.
auto& key = get_key(m_values.emplace_back(std::forward<Args>(args)...));
auto hash = mixed_hash(key);
auto dist_and_fingerprint = dist_and_fingerprint_from_hash(hash);
auto bucket_idx = bucket_idx_from_hash(hash);
while (dist_and_fingerprint <= at(m_buckets, bucket_idx).m_dist_and_fingerprint) {
if (dist_and_fingerprint == at(m_buckets, bucket_idx).m_dist_and_fingerprint &&
m_equal(key, get_key(m_values[at(m_buckets, bucket_idx).m_value_idx]))) {
m_values.pop_back(); // value was already there, so get rid of it
return static_cast<uint32_t>(at(m_buckets, bucket_idx).m_value_idx);
}
dist_and_fingerprint = dist_inc(dist_and_fingerprint);
bucket_idx = next(bucket_idx);
}
// value is new, place the bucket and shift up until we find an empty spot
auto value_idx = static_cast<value_idx_type>(m_values.size() - 1);
place_and_shift_up({dist_and_fingerprint, value_idx}, bucket_idx);
return static_cast<uint32_t>(value_idx);
}
template <class... Args>
auto emplace(Args&&... args) -> std::pair<iterator, bool> {
if (is_full()) {

@ -17,8 +17,6 @@
#include "types.h"
#include "gc.h"
#pragma pack(push, 1)
template<class T>
struct vector_base {};
struct vectortype_cstorage{
void* container;
@ -32,7 +30,6 @@ public:
void inline _copy(const vector_type<_Ty>& vt) {
// quick init while using malloc
//if (capacity > 0) free(container);
this->size = vt.size;
this->capacity = vt.capacity;
if (capacity) {
@ -72,6 +69,8 @@ public:
}
constexpr vector_type() noexcept : size(0), capacity(0), container(0) {};
constexpr vector_type(_Ty* container, uint32_t len) noexcept : size(len), capacity(0), container(container) {};
constexpr vector_type(const char** container, uint32_t len,
typename std::enable_if_t<!std::is_same_v<_Ty, const char*>>* = nullptr) noexcept = delete;
constexpr explicit vector_type(const vector_type<_Ty>& vt) noexcept : capacity(0) {
_copy(vt);
}
@ -331,7 +330,7 @@ public:
inline vector_type<_Ty> subvec_deep(uint32_t start = 0) const { return subvec_deep(start, size); }
vector_type<_Ty> getRef() { return vector_type<_Ty>(container, size); }
~vector_type() {
if (capacity > 0);// GC::gc_handle->reg(container, sizeof(_Ty) * capacity);//free(container);
if (capacity > 0) GC::gc_handle->reg(container, sizeof(_Ty) * capacity);//free(container);
container = 0; size = capacity = 0;
}
#define Compare(_op) \
@ -395,6 +394,39 @@ public:
_Make_Ops(Opseq)
};
template <>
constexpr vector_type<std::string_view>::vector_type(const char** container, uint32_t len,
typename std::enable_if_t<true>*) noexcept
{
size = capacity = len;
this->container = static_cast<std::string_view*>(
malloc(sizeof(std::string_view) * len));
for(uint32_t i = 0; i < len; ++i){
this->container[i] = container[i];
}
}
template<>
constexpr vector_type<std::string_view>::vector_type(const uint32_t size, void* data) :
size(size), capacity(0) {
this->container = static_cast<std::string_view*>(
malloc(sizeof(std::string_view) * size));
for(uint32_t i = 0; i < size; ++i){
this->container[i] = ((const char**)data)[i];
}
//std::cout<<size << container[1];
}
// template<>
// void vector_type<std::string_view>::init_from(const uint32_t size, void* data) {
// this->size = this->capacity = size;
// this->container = static_cast<std::string_view*>(
// malloc(sizeof(std::string_view) * size));
// for(uint32_t i = 0; i < size; ++i){
// this->container[i] = container[i];
// }
// }
template <>
class vector_type<void> {
